This week I ran into a strange error in my Search Service Application: the crawl was not running.
Every time the crawl tried to index data, the error message below appeared, and the crawl log showed 0 (zero) successes and 1 (one) error, Top Level…:
“http://globalintranet.mabotega.local.
The item could not be accessed on the remote server because its address has an invalid syntax. ( SearchID = F1312B59-3AE6-4121-91BD-74B774BE0D07 )”
So I needed a way to check whether our crawl system was working, and the answer was Fiddler2 (http://fiddler2.com/).
In this post, I describe how to troubleshoot the SharePoint Search crawl process with Fiddler, so you can check whether the crawl is gathering data from the web applications (web sites) configured in a content source.
In my environment I do not use a proxy in the Search configuration, as shown in the screenshot below. However, after I set Fiddler as the proxy and then removed it from the Search configuration again, Search went back to crawling content for my web sites. Very strange! Probably some stale data in my configuration database, I guess. That is my only explanation.
OK, enough talk. Let's follow the technical steps below:
SharePoint Search – Crawl Troubleshooting
Using this incredible tool called Fiddler, we are going to configure SharePoint Search to crawl through Fiddler as a proxy, so we can watch the traffic and check whether the system is gathering data from our web sites. With this technique you can see whether the crawl system is behaving as expected or not.
Necessary Steps:
- Download and install Fiddler on the server running the crawl (http://fiddler2.com/). The installer is a simple Next, Next, Finish process;
- Determine which account is running the crawl. Usually it will be the Default content access account listed in Search Administration:
- If you have Crawl Rules set up for specific content sources, you may have alternate credentials specified, so check your rules and make sure you are using the correct account for testing. In my case, I don't have any, but be sure to check;
- Start Fiddler as the crawl account: hold down the [Ctrl][Shift] keys, right-click Fiddler, and choose “Run as different user”. Log in as the crawl account, or as the account for the rule you want to check. (If this option is not available, you may have to log out and log back in as the crawl account. Either way, you need to run Fiddler as the crawl account):
In my scenario, my default access account is RBTCRAWL:
- Once Fiddler is running, choose Tools | Fiddler Options… and click the Connections tab. Note the “Fiddler listens on port:” setting; 8888 is the default. Make sure it does not duplicate a port already in use by SharePoint or IIS. Close the dialog after making any necessary adjustments to the port. In my scenario, I don't have any web site running on port 8888, so I left the default;
- Open a browser and go to http://localhost:8888 (or whatever your port number is for Fiddler) and you should see something like the following, indicating that you are set up correctly.
- To configure SharePoint to use Fiddler as a proxy, return to Search Administration and choose the link for Proxy Server in the System Status section. In my scenario I am not using a proxy:
- Configure SharePoint to use Fiddler by choosing “Use the proxy server specified” and adding the address and port that Fiddler is listening on (port 8888 by default).
- Click OK to save your settings.
- Start the crawl for the content source that you are having issues with by choosing Content Sources. Select the content source and choose Start Full Crawl.
- Once the crawl starts, you should begin to see activity in Fiddler. In the example below I am crawling some web applications. The crawler always requests a robots.txt file first to check for crawling rules for the web site (even if no such file was set up). In my case I don't have one, so Fiddler displays the 404 results for it (1). When crawling a SharePoint site, you will notice that the crawler uses the “SiteData service” (2) to gather information about the site from SharePoint. After that you will see a result for each request performed by the crawl system:
- Once you are done testing, be sure to reset the Proxy settings in the Search Application to return to the previous configuration:
- Armed with the results of the Fiddler trace you can see the conversation that SharePoint is having with the content source that you are troubleshooting.
- Troubleshooting SharePoint Search can be a big challenge, and I really hope this post helps you find out whether your Search is crawling or not.
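If you prefer to script the two sanity checks from the steps above (is the Fiddler port already taken, and does a request sent through the proxy get an answer), here is a minimal Python sketch. It is not part of my original setup: it assumes Python 3 is available on the crawl server, and the function names `port_in_use` and `fetch_via_proxy` are hypothetical helpers I made up for illustration. It also does not authenticate as the crawl account, so a SharePoint site may well answer 401; the goal is only to see the request show up in Fiddler.

```python
import socket
import urllib.error
import urllib.request


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the TCP connection succeeds.
        return sock.connect_ex((host, port)) == 0


def fetch_via_proxy(url: str,
                    proxy: str = "http://localhost:8888",
                    timeout: float = 10.0) -> int:
    """Request url through the given HTTP proxy and return the status code.

    Raises urllib.error.URLError if the proxy itself cannot be reached.
    """
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers (for example a 404 for robots.txt) still prove
        # that the request travelled through the proxy.
        return err.code
```

Before starting Fiddler, `port_in_use(8888)` should return False; once Fiddler is running it should return True. After that, something like `fetch_via_proxy("http://globalintranet.mabotega.local/robots.txt")` should produce a new session in the Fiddler window, whatever status code comes back.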