Jaybone

Random PXE boot failures

Hi, all. We're currently seeing a strange issue with OSD and new systems.


What's happening is that probably 95% or more of new systems have zero problems - PXE boot, pull down a task sequence and all's well.


The remainder, though... they'll start the PXE process seemingly fine, but get stuck.



"contacting server: w.x.y.z........................................ <imagine two more lines of ... here>

Failed to restart TFTP.

TFTP download failed.

PXE-M0F: Exiting Intel Boot Agent.

Selected boot device failed. Press any key to reboot the system."



SMSPXE.log shows repeated "Looking for bootimage XXX00004" messages at this point, and will do so until the client is turned off or eventually times out.


While this is happening, other systems, both existing and new, can PXE boot into the OSD environment just fine, and use the exact same boot image (which shows up in SMSPXE.log as "Looking for bootimage XXX00004" exactly one time for that system) while they're going through the PXE process. Minutes, hours, or days later, the problematic systems will all of a sudden start behaving normally, and complete the boot process.
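For anyone wanting to compare a stuck client against a healthy one, a rough sketch like the following tallies the lookup entries per boot image ID. It only assumes that each entry contains the text "Looking for bootimage <ID>" as quoted above; the function names, the threshold, and the sample lines are made up for illustration. The counts are most meaningful when fed a slice of SMSPXE.log covering a single boot attempt.

```python
import re
from collections import Counter

# Assumption: each relevant entry contains "Looking for bootimage <PackageID>",
# as quoted in the post. Nothing else about the log line format is relied upon.
PATTERN = re.compile(r"Looking for bootimage (\w+)")


def count_bootimage_lookups(lines):
    """Return a Counter mapping boot image package ID -> number of lookup entries."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def repeated_lookups(counts, threshold=5):
    """Return the IDs looked up at least `threshold` times (hypothetical cutoff)."""
    return {image: n for image, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    # Hypothetical sample standing in for a slice of a real SMSPXE.log.
    sample = ["Looking for bootimage XXX00004"] * 6 + ["Looking for bootimage XXX00005"]
    counts = count_bootimage_lookups(sample)
    print(counts)
    print(repeated_lookups(counts))
```

A healthy boot should show the image once in its slice, so anything over the cutoff marks the attempts worth a closer look.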


Googling this has turned up a number of promising results, but they all seem to be from people who are having this problem with everything, not just a tiny percentage of systems. I'm not sure that restarting or reinstalling WDS is the answer, since the failures seem so random.


Anyone have any ideas, or know where to look to nail down what's causing this?


post-24005-0-62056000-1394733533_thumb.png



PXE booting depends a lot on the NIC in that PC. Update the BIOS on the PCs in question. If these are older computers, put in a new NIC, or go into the BIOS and make sure all the NIC options are correct (PXE enabled, and so on). One last thing: have your network guys make sure PortFast is set on all of your switches.

 

Your pic above clearly shows the PC is not contacting the WDS service on the SCCM server, either because of too much activity or because it simply times out. In the registry on the SCCM server, find the WDS settings and double the packet size; you can Google that to find the exact key to adjust. Finally, check the network card on your PC: is it optimized, does it have the correct DNS, can you ping, and do you get good response times? Does DNS work by name and by IP, and can you do nslookup by both? Is DNS set up correctly on all of your DNS servers? You need an entry for your SCCM server.
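The "by name and by IP" check can be scripted if you have a lot of machines to go through. This is just a minimal sketch using Python's standard `socket` module; the function name `check_dns` and the injectable `resolve`/`reverse` parameters are my own choices for illustration, not anything SCCM provides.

```python
import socket


def check_dns(hostname, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Forward-resolve `hostname`, then reverse-resolve the resulting address.

    Returns (ip, reverse_name). `ip` is None if the forward lookup fails;
    `reverse_name` is None if the reverse (PTR) lookup fails.
    """
    try:
        ip = resolve(hostname)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return None, None
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        reverse_name = reverse(ip)[0]
    except OSError:
        reverse_name = None
    return ip, reverse_name
```

Calling `check_dns("your-sccm-server")` with your own server name returns the resolved address and the name the reverse lookup gives back. If the reverse name is missing or does not match, that is the kind of DNS gap worth chasing.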


Thanks for the responses, all.

 

Do you have SCCM 2012 R2? If so, install the following hotfix

 

http://support.microsoft.com/kb/2910552

 

Nope, this is 2012, SP1.

 


The ones we see this with most are brand new Dell Optiplex 7010 units, BIOS A16 (latest available). Identical configs on all of them, and some are flawless while others are wonky. Same switchports, same cabling, different results. Clients' network settings are triple+ checked and known good, DNS working well.

 

 

As far as not contacting wds: it sure looks that way from the client end, but the logs seem to indicate otherwise, with the repeated "Looking for bootimage..." entries in SMSPXE while this is happening. This is what has me so confused.

 

I'll tweak the packet size settings and see if that makes a difference, thanks.
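For what it's worth, here is the back-of-the-envelope reasoning for why the packet size might matter. It is a sketch that assumes plain lock-step TFTP, where each data block has to be acknowledged before the next one is sent, and no windowing. The 200 MiB image size and the two block sizes are hypothetical figures, not our actual values.

```python
def tftp_data_packets(file_size_bytes, block_size_bytes):
    """Number of DATA packets needed to send a file over lock-step TFTP.

    A transfer ends with a block shorter than the block size, so a file that
    is an exact multiple of the block size needs one extra, empty block.
    """
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    return file_size_bytes // block_size_bytes + 1


if __name__ == "__main__":
    image_size = 200 * 1024 * 1024  # hypothetical 200 MiB boot image
    for block_size in (1024, 2048):  # hypothetical "before" and "doubled" sizes
        packets = tftp_data_packets(image_size, block_size)
        print(f"{block_size}-byte blocks: {packets} data packets, each needing an ACK")
```

Doubling the block size roughly halves the number of round trips, which gives a slow or lossy path fewer chances to hit a timeout. Blocks bigger than the network MTU get fragmented, though, so bigger is not automatically better.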


Have you distributed BOTH boot images? Or just the one?

 

Both are distributed. Only the x86 image is referenced by any TS.

That's something I didn't consider, though - if the 64 bit image is hosed up, maybe that could cause problems?

In SMSPXE.log, the repeated "Looking for bootimage XXX00004" entries are referencing the x86 boot image - the 64bit image is XXX00005.


I think I have seen that message too, and it may well have happened with the Dell OptiPlex 7010. We deployed a bunch of them with Windows 7 32-bit. However, I don't remember how we solved it. :(

 

Almost every time we deploy a new model we face some troubles.

 

I would check your 64-bit boot image. Even if you are only deploying x86, the 64-bit image needs to be distributed as well.


~~~much time passes~~~

 

After both messing with packet size settings and redistributing both the x86 and 64 boot images, we haven't run into this again for probably close to four weeks. We also haven't distributed a whole lot of new systems in that time, so I still don't have a definite answer, but it's looking good so far.

