I just installed Solr, edited schema.xml, and am now trying to index some test data and search on it.
In the XML file I'm sending to Solr, one of my fields looks like this:
<![CDATA[<p>some text in a paragraph tag</p>]]>
There's HTML there, so I've wrapped it in CDATA.
In my Solr schema.xml, the definition for that field looks like this:
When I ran the POSTing tool, everything went OK, but when I search for content which I know is inside the PageContent field, I get no results.
However, when I set the defaultSearchField node to PageContent, it works. If I set it to any other field, the search doesn't look in PageContent.
Am I doing something wrong? What's the issue?
To clarify on the error:
I've uploaded a "doc" with the following data:
PageID: 928
PageName: some name
PageContent: html content (wrapped in CDATA)
In my schema I've defined the fields as such:
And:
<uniqueKey>PageID</uniqueKey>
<defaultSearchField>PageName</defaultSearchField>
Now, when I use the Solr admin tool and search for "some name", I get a result. But if I search for "html content", "html", "content" or "928", I get no results.
Why?
Solution
You mentioned that your default search field is set to PageName, so I wouldn't expect a search for "content" to return anything: an unqualified query only looks in the default field.
You probably meant to put "PageContent:content" in the search box to find data in that field. If you want to search against multiple fields, check out the DisMax request handler: http://wiki.apache.org/solr/DisMaxRequestHandler. The Solr admin console isn't a great tool for experimenting with all the DisMax search options, so you'll want to manipulate the query URL directly for that.
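To make the two kinds of query concrete, here is a rough sketch that builds the URLs in Python. The base URL (http://localhost:8983/solr/select) and the helper names are my assumptions, not something from your setup, and defType=dismax assumes a Solr version that supports that parameter; on older installs you would point at a DisMax handler you defined yourself.

```python
from urllib.parse import urlencode

# Assumed base URL: the select handler of a local Solr instance.
# Adjust host, port and path to match your own installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def field_query_url(field, term):
    """Search one specific field using the field:term syntax."""
    return SOLR_SELECT + "?" + urlencode({"q": f"{field}:{term}"})


def dismax_query_url(term, fields):
    """Search several fields at once via DisMax; qf is a space-separated list."""
    params = {"q": term, "defType": "dismax", "qf": " ".join(fields)}
    return SOLR_SELECT + "?" + urlencode(params)


print(field_query_url("PageContent", "content"))
print(dismax_query_url("content", ["PageName", "PageContent"]))
```

The first URL is the same thing as typing PageContent:content into the admin search box; the second asks DisMax to look in both PageName and PageContent.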
Regardless, I agree with the previous poster: if your analysis chain isn't set up properly to deal with HTML, you are likely to get all sorts of unexpected search results. Strip the HTML out and index text only.
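If you'd rather strip the markup before posting the document, something along these lines would do it. This is only a sketch using Python's standard html.parser; strip_html and the extractor class are names I made up for illustration, and your indexing pipeline may well be in another language.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(markup):
    """Return the text of an HTML fragment with tags removed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    # Join chunks with a space so words in adjacent tags don't fuse,
    # then collapse runs of whitespace. Note this also splits a word
    # that has a tag in the middle of it.
    return " ".join(" ".join(parser.chunks).split())


print(strip_html("<p>html <b>content</b></p>"))
```

You would then send the returned plain text as the PageContent value, with no CDATA wrapper needed for the HTML.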
If you want your standard query handler to search against all your fields, you can change it in your solrconfig.xml (I always add a second query handler instead of modifying "standard"). The qf parameter is the list of fields you want to search against; it's a space-separated list.
all
true
*
<str name="qf">PageName PageContent</str>