solr - How to crawl a website by specifying depth


I am using Nutch 2.x. I tried to use the nutch command with the depth option:

$: nutch inject ./urls/seed.txt -depth 5

After executing this command I get a message like:

Unrecognized arg -depth

When that failed, I tried to use nutch crawl:

$: nutch crawl ./urls/seed.txt -depth 5

This gives an error like:

Command crawl is deprecated, please use bin/crawl instead.

So I tried to use the bin/crawl command to crawl the seed URLs with a depth. In that case it asks for Solr, but I am not using Solr.

So my question is: how do I crawl a website to a specified depth without indexing it in Solr?

Answer to your question:

If you want to use the Nutch crawler and do not want to index into Solr, remove the Solr indexing part from the crawl script.
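To illustrate what the crawl script does once indexing is out of the picture, the rounds can also be driven by hand. This is only a minimal sketch under assumptions: `crawl_rounds`, `run`, `NUTCH` and `DRY_RUN` are hypothetical names invented here, and the sub-command arguments are modeled on the stock Nutch 2.x crawl script, which differs between releases, so check the usage output of each `bin/nutch` sub-command before relying on them.

```shell
#!/usr/bin/env bash
# Sketch: run N crawl rounds with Nutch 2.x and never call the Solr indexer.
# ASSUMPTION: the arguments below follow the stock 2.x bin/crawl script;
# some releases expect different arguments (for example for updatedb).

NUTCH="${NUTCH:-bin/nutch}"   # path to the nutch launcher (assumed)
DRY_RUN="${DRY_RUN:-0}"       # 1 = print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# crawl_rounds <seed_dir> <rounds> [top_n]
crawl_rounds() {
  local seed_dir="$1" rounds="$2" top_n="${3:-1000}"
  local i batch_id

  # Seed the crawl database once.
  run "$NUTCH" inject "$seed_dir" || return 1

  # Each round fetches the pages discovered in the previous one,
  # so the number of rounds plays the role of the old -depth option.
  for ((i = 1; i <= rounds; i++)); do
    batch_id="$(date +%s)-$i"
    run "$NUTCH" generate -topN "$top_n" -batchId "$batch_id" || return 1
    run "$NUTCH" fetch "$batch_id" || return 1
    run "$NUTCH" parse "$batch_id" || return 1
    run "$NUTCH" updatedb "$batch_id" || return 1
    # Deliberately no index step here: nothing is sent to Solr.
  done
}
```

With the file sourced, `DRY_RUN=1; crawl_rounds ./urls 5` prints the commands for five rounds without touching the crawl database, which is a cheap way to compare them against the usage text of the installed Nutch version.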

Answer to your other question:

To get the HTML content of all links crawled by Nutch (this link):

This will definitely solve your issue.

