For English Text:
Under DOS, go to the directory where the parser is located, then type the line below:
lexparser.bat input.txt >output.txt
Then press Enter to get your result.
For Chinese Text:
First, you need to segment the input text (search for ICTCLAS in this forum if you don't have a segmenter). That is, convert 今天真热。 to 今天 真 热 。
Then save the segmented text in GB encoding (not UTF-8, which is used for the GUI/Windows version).
Next, create a batch file by copying and pasting the following lines (between the rows of equal signs) into Notepad, and save it as lexparserCh.bat in the same folder as your parser program:
=============================
@echo off
:: Runs the Chinese factored parser on one file, printing Penn trees and collapsed typed dependencies
:: usage: lexparserCh fileToParse
java -server -mx800m -cp "stanford-parser.jar;" edu.stanford.nlp.parser.lexparser.LexicalizedParser -outputFormat "penn,typedDependenciesCollapsed" chineseFactored.ser.gz %1
=============================
Finally, go to the directory where the parser is located, and type the line below:
lexparserCh.bat inputCh.txt >outputCh.txt
Then press Enter to get your result.
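If your segmented text is currently saved as UTF-8, the re-encoding step above can be sketched in Python. This is only a minimal sketch: the function name utf8_to_gb and the file names are hypothetical, and the choice of gb18030 (a superset of GB2312 and GBK) as the "GB" encoding is an assumption, so switch it to "gbk" or "gb2312" if your setup expects one of those.

```python
from pathlib import Path


def utf8_to_gb(src, dst, encoding="gb18030"):
    """Re-encode a UTF-8 text file into a GB-family encoding.

    `encoding` is an assumption; "gbk" and "gb2312" are other
    GB-family codecs that Python ships with.
    """
    # "utf-8-sig" also drops the byte-order mark that Notepad may add.
    text = Path(src).read_text(encoding="utf-8-sig")
    # Write raw bytes so the chosen encoding is applied exactly once.
    Path(dst).write_bytes(text.encode(encoding))


if __name__ == "__main__":
    import tempfile

    # Self-contained demo: write a segmented sentence as UTF-8, convert it,
    # and read it back to confirm the round trip.
    with tempfile.TemporaryDirectory() as tmp:
        utf8_file = Path(tmp) / "inputCh_utf8.txt"  # hypothetical name
        gb_file = Path(tmp) / "inputCh.txt"
        utf8_file.write_text("今天 真 热 。\n", encoding="utf-8")
        utf8_to_gb(utf8_file, gb_file)
        print(gb_file.read_bytes().decode("gb18030"))
```

The resulting inputCh.txt is the file you would then pass to lexparserCh.bat.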
Using the Stanford POS Tagger is quite similar. First, create a batch file by copying and pasting the following lines (between the rows of equal signs) into Notepad:
=============================
@echo off
:: To tag a file using the pre-trained bidirectional model
:: usage: postagger.bat (reads input.txt and writes output.txt in the current folder)
java -mx300m -classpath postagger-2006-05-21.jar edu.stanford.nlp.tagger.maxent.MaxentTagger -model wsj3t0-18-bidirectional/train-wsj-0-18 -file input.txt >output.txt
=============================
Next, save it as a plain text file named postagger.bat in the same folder as your Stanford POS Tagger program.
Then, save an English text as input.txt in the same folder as the Tagger and postagger.bat.
Finally, go to the folder containing the Tagger, postagger.bat, and input.txt, and double-click postagger.bat to get your result file, output.txt.
To tag another file, simply rename output.txt and change the content of input.txt.
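If renaming files by hand gets tedious, the same command can be built for any input file from a small Python script. This is only a sketch, assuming Java is on your PATH and the script is run from the Tagger folder. The jar name, model path, and flags are copied from the batch file above; the function names tagger_command and tag_file are hypothetical.

```python
import subprocess

# Values copied from the batch file above.
JAR = "postagger-2006-05-21.jar"
MODEL = "wsj3t0-18-bidirectional/train-wsj-0-18"


def tagger_command(input_file):
    """Build the same java command as postagger.bat for a given input file."""
    return [
        "java", "-mx300m",
        "-classpath", JAR,
        "edu.stanford.nlp.tagger.maxent.MaxentTagger",
        "-model", MODEL,
        "-file", str(input_file),
    ]


def tag_file(input_file, output_file):
    """Run the tagger and redirect its standard output to output_file."""
    with open(output_file, "wb") as out:
        # check=True raises CalledProcessError if java exits with an error.
        subprocess.run(tagger_command(input_file), stdout=out, check=True)
```

For example, tag_file("essay.txt", "essay_tagged.txt") would tag essay.txt without touching input.txt or output.txt.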
Good luck!
The POS Tagger was trained on English texts, though reportedly it can be trained to tag Chinese texts. However, that may be difficult for many of us. A good alternative is to use ICTCLAS_Win.exe to tag your Chinese texts. You can download it under "NLP Tools" in my online storage at:
http://corpuslaohong.ys168.com/
Password: corpus4u
Leave a message there after you get it.
If you do want to tag Chinese texts with Stanford tools, the Stanford Parser can also produce POS information for Chinese texts. Read my instructions on how to parse a Chinese text with the Stanford Parser in earlier posts.