kcorrie

Is there a doctor in the house? My SCCM is sick!

Hello,

I have some concerns about my SCCM environment and I hope I can get some help resolving them.  Some info about my setup first:

ConfigMgr 1706
Single stand-alone primary site
- Site server is 2012 R2 and also has MP and DP roles
- Remote DB server is 2012 R2 and SQL Server 2014
- Remote WSUS/SUP is Server 2016
- IBCM server is 2012 R2 and has MP and DP roles

We've been using SCCM at my workplace for over a year now and have been slowly transitioning to it from Kace.  I manage SCCM alone and came into it with little to no previous experience.

My top concern is that inventories are not working, and software updates are affected as a result.  As you can see in the attached image, my update deployment for November is showing 85% unknown status.

Looking at Component Status in the Monitoring workspace, there are several components with critical and warning statuses.

Critical:
Client config manager
MP control manager
State system

Warning:
Fallback status point
Inventory data loader
Site backup
Software inventory processor
MP control manager

How do I set the priority for resolving these components, and where do I get started on each?  I realize troubleshooting each component is a topic of its own, but I would appreciate some direction to get started.  Once I have a priority set, I may create a new topic each time I start working on a component and reference this topic as the parent.

Thanks!

UpdateDeploymentStats.JPG

ComponentStatus.JPG


Hi and welcome!

The first thing to fix is your SMS_MP_CONTROL_MANAGER. You need to find out what is wrong with the management point (MP), as that is how your clients communicate back to SCCM. Right-click the component, choose Show Messages > All, and see what it tells you. Those errors need to be fixed first, and then we'll move on to the other issues. Feel free to post the errors here and we'll do what we can to help.

cheers

niall


Thanks for the quick response.  I figured MP would be the place to start but didn't want to be taking shots in the dark.

Showing all messages during the last day for SMS_MP_CONTROL_MANAGER, I see these warnings a lot:
ID 5446 - MP has rejected the request because certificate has expired.
ID 5413 - MP has discarded a report when processing Relay.  Possible cause: Corruption or invalid user definition.
ID 5447 - MP has rejected a message because the signature could not be validated. If this is a valid client, it will attempt to re-register automatically so its signature can be correctly validated.

* When SCCM was set up originally, we had clients using PKI certs, and that would show for "Client certificate" in the Configuration Manager applet in Control Panel.  Now "Client certificate" shows self-signed.  I believe this has something to do with a support call I had with MS a few months back to resolve a problem with clients getting updates.  Support was changing HTTP versus HTTPS connection settings, and I think that altered our certificate settings in SCCM.  Could these warning messages be the result of a client certificate mismatch?

This error was logged three times in a row last night at 11:30:
ID 5420 - Management Point encountered an error when connecting to the database CMDB on SQL Server SCCMDB. The OLEDB error code was 0x80004005.

* The message log shows this happening almost nightly.  I think this has to do with our nightly backups.  I will make an adjustment to the backup schedule for the DB server and see if the errors change or go away.

This error was logged 20 times on three separate days in the last month:
ID 5436 - MP Control Manager detected management point is not responding to HTTP requests.  The HTTP status code and text is 500, Internal Server Error.

* This happened once Nov 8, 18 times Nov 12, and once Nov 26.

 

I think I'll be able to resolve 5420 on my own.  What do you think about 5436 and the warnings about rejected requests/messages?


I'm still working on SMS_SITE_BACKUP and the SQL connection errors in SMS_MP_CONTROL_MANAGER.  I had hoped changes I made on Friday would fix the problems but I'm still seeing them today.

How do I go about fixing the remaining problems in SMS_MP_CONTROL_MANAGER?  The warnings are mostly rejected messages because of invalid or expired certs.  The descriptions contain SMSIDs.  Do I need to use those to track down the clients and view their logs for more detail?


3 hours ago, GarthMJ said:

If you review the status messages, they will give you suggestions for each error. Have you tried these suggestions?

In SMS_MP_CONTROL_MANAGER:

Message ID 5446 (MP has rejected the request because CD(SMSID = XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX) certificate has expired.) appears to be caused by our IBCM server.  I wonder if this is because we are no longer using PKI certs?  Can you tell me where in the console to enable the setting to start using our PKI cert again rather than self-signed?

Message ID 5447 (MP has rejected a message from GUID:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX because the signature could not be validated. If this is a valid client, it will attempt to re-register automatically so its signature can be correctly validated.) isn't clear to me.  When I search on the GUIDs given, I see a mix of active and inactive clients.  LocationServices.log on the client is not helpful.  I see some "1 assigned MP errors in the last 10 minutes, threshold is 5." messages scattered throughout, but nothing major.  MPControl.log on the site server is not helpful either.  I only see "ReadMPStringSettings(): RegQueryValueExW() failed - 0x80070002" a handful of times in the last 24 hours.  Is there something else I should be looking at?
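For counting how often a message such as "0x80070002" appears in a log like MPControl.log, a small script can be easier than scrolling. The following is a minimal sketch, assuming the log uses the usual CMTrace-style layout (`<![LOG[...]LOG]!><time="..." date="..." component="...">`); the function name and the sample entries are made up for illustration and are not taken from the poster's logs.

```python
import re
from collections import Counter

# Assumed CMTrace-style entry layout:
#   <![LOG[message]LOG]!><time="..." date="..." component="..." ...>
LOG_ENTRY = re.compile(
    r'<!\[LOG\[(?P<message>.*?)\]LOG\]!>'
    r'<time="(?P<time>[^"]*)"\s+date="(?P<date>[^"]*)"\s+'
    r'component="(?P<component>[^"]*)"',
    re.DOTALL,
)


def count_matches_per_day(log_text, needle):
    """Count log entries whose message contains `needle`, grouped by date."""
    per_day = Counter()
    for entry in LOG_ENTRY.finditer(log_text):
        if needle in entry.group("message"):
            per_day[entry.group("date")] += 1
    return dict(per_day)


if __name__ == "__main__":
    # Invented sample entries, only to show the expected input shape.
    sample = (
        '<![LOG[ReadMPStringSettings(): RegQueryValueExW() failed - 0x80070002]LOG]!>'
        '<time="23:30:01.123+360" date="12-04-2017" '
        'component="SMS_MP_CONTROL_MANAGER" context="" type="3" thread="1234" file="x.cpp:1">\n'
        '<![LOG[Some unrelated informational message]LOG]!>'
        '<time="23:31:00.000+360" date="12-04-2017" '
        'component="SMS_MP_CONTROL_MANAGER" context="" type="1" thread="1234" file="x.cpp:2">\n'
    )
    print(count_matches_per_day(sample, "0x80070002"))
```

Grouping by date makes it easier to see whether an error clusters on particular days, as the 5436 messages did, or is spread evenly.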


I'd really appreciate any direction someone can give to resolving this MP issue.

Our prep bench has also had recent trouble PXE booting computers for OS deployment, and I wonder if it's related to MP health.  We're currently working on replacing DHCP options 66 and 67 with IP helpers as recommended by Microsoft as a possible resolution but I'm concerned that's not really the problem.

 

Thanks


I'm still trying to figure this out.

CPU usage on the site server is hanging out around 75-80% and occasionally dips to around 40% on six vCPUs.  Not sure what's expected normal here.

IIS log files are out of control and growing by gigabytes per day.  Glancing through the latest log shows mostly 200s, but I see 503s scattered throughout.  Why would the logs be so big?
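One way to see what is filling the IIS logs is to tally status codes and requested URIs. The sketch below assumes the logs are in the W3C extended format, where a `#Fields:` line names the space-separated columns and `sc-status` and `cs-uri-stem` are among them. The function name and the file path in the usage comment are hypothetical.

```python
from collections import Counter


def tally_iis_log(lines):
    """Tally sc-status and cs-uri-stem values from W3C-format IIS log lines."""
    fields = []
    statuses = Counter()
    uris = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive lists the column names for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            # Skip other directives, and rows seen before any #Fields line.
            continue
        row = dict(zip(fields, line.split()))
        statuses[row.get("sc-status", "?")] += 1
        uris[row.get("cs-uri-stem", "?")] += 1
    return statuses, uris


# Usage (hypothetical path; adjust to wherever the site's logs are stored):
# with open(r"C:\inetpub\logs\LogFiles\W3SVC1\u_ex171204.log",
#           encoding="utf-8", errors="replace") as handle:
#     statuses, uris = tally_iis_log(handle)
#     print(statuses.most_common())
#     print(uris.most_common(10))
```

If a handful of URIs account for most of the rows, that points to which client activity is driving the log growth, and the status tally shows how common the 503 responses are relative to the 200s.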

We're managing 14,000+ clients which are set to check for new policies every 240 minutes--I recently increased this from every 60 minutes to try and take some load off since we are not making many policy changes right now.
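As a rough sanity check on what the polling change buys, the average policy request rate can be estimated from the client count and the interval. This is a back-of-the-envelope sketch that assumes one policy request per client per interval, spread evenly over time. That is a simplification: real MP traffic also includes inventory, state messages, and other requests.

```python
def policy_requests_per_second(clients, interval_minutes):
    """Average policy polls per second, assuming one evenly spread poll
    per client per interval (a simplifying assumption)."""
    return clients / (interval_minutes * 60)


if __name__ == "__main__":
    for minutes in (60, 240):
        rate = policy_requests_per_second(14000, minutes)
        print(f"{minutes}-minute interval: {rate:.2f} requests/second")
```

Under these assumptions, moving 14,000 clients from a 60-minute to a 240-minute interval cuts the average from roughly 3.9 to roughly 1 policy request per second, so policy polling alone is unlikely to explain sustained high CPU.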

CaptureLogFiles.JPG


6 minutes ago, GarthMJ said:

It sounds like your site is not healthy.  Test your MP: is it working correctly?

https://www.enhansoft.com/blog/how-to-test-your-mp-to-confirm-if-it-is-healthy

If not, remove it and add it back again.

Thanks for the suggestion, but my MP passed this test.  I tried the URL on my computer and another computer I suspect of having problems communicating with the MP.
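The post does not say which URL was tested. A commonly used check is to request the MP's `mplist` and `mpcert` pages under `/sms_mp/.sms_aut`, and the sketch below assumes that is the test meant here. The host name in the usage comment is hypothetical, and a 200 response only shows the MP answered that one request, not that every client can register.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def mp_test_urls(mp_host, scheme="http"):
    """Build the commonly used MP health-check URLs for a given host."""
    base = f"{scheme}://{mp_host}/sms_mp/.sms_aut"
    return {"mplist": f"{base}?mplist", "mpcert": f"{base}?mpcert"}


def check_mp(mp_host, timeout=10):
    """Request each test URL and record the HTTP status or the error text."""
    results = {}
    for name, url in mp_test_urls(mp_host).items():
        try:
            with urlopen(url, timeout=timeout) as response:
                results[name] = response.status
        except HTTPError as error:
            # The server answered, but with an error status such as 500 or 503.
            results[name] = error.code
        except (URLError, OSError) as error:
            # Name resolution, connection, or timeout failure.
            results[name] = str(error)
    return results


# Usage (hypothetical host name):
# print(check_mp("sccm01.example.local"))
```

Running the same check repeatedly, or from a machine suspected of having trouble, can help catch intermittent 500 or 503 responses that a single manual test in a browser would miss.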
