Since there isn't much sample code out there that demonstrates how you can use Java libraries, I thought I would post some!
This code makes use of LingPipe, a Java library for natural language processing. It splits a piece of text into sentences.
To install, download LingPipe and drop the .jar files into java->lib in the plug-ins folder of Illustrator.
The code I converted to JavaScript: SentenceChunkerDemo.java
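// Make the LingPipe classes available by their short names.
importPackage(Packages.com.aliasi.sentences, Packages.com.aliasi.tokenizer, Packages.com.aliasi.chunk);

// The inner double quotes are escaped so the string literal parses.
var text = new java.lang.String("This text is a test text. It's function is to be tested. What do you think of that, mr. test text? \"I don't mind.\" said the test text.");
var ca = text.toCharArray();

// The tokenizer and the sentence model together decide where sentences end.
// Lower-case names avoid shadowing the imported TokenizerFactory and SentenceModel types.
var tokenizerFactory = new IndoEuropeanTokenizerFactory();
var sentenceModel = new MedlineSentenceModel();
var sentenceChunker = new SentenceChunker(tokenizerFactory, sentenceModel);

// Chunk the whole character array; each chunk is one sentence.
var chunking = sentenceChunker.chunk(ca, 0, text.length());
var sentences = chunking.chunkSet();
var slice = chunking.charSequence().toString();

var i = 1;
for (var it = sentences.iterator(); it.hasNext(); ) {
    var sentence = it.next();
    // start() and end() are character offsets into slice.
    var start = sentence.start();
    var end = sentence.end();
    print("SENTENCE " + (i++) + ":");
    print(slice.substring(start, end));
}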
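If you would rather have the sentences in a JavaScript array than printed straight away, something like the following might do. It is only a sketch: collectSentences is a name I made up, not part of LingPipe, and it assumes nothing beyond the chunkSet(), charSequence(), iterator(), start() and end() calls already used in the loop.

```javascript
// Hypothetical helper (not part of LingPipe): gathers the sentences into a
// JavaScript array instead of printing them one by one.
// Assumes `chunking` offers chunkSet() and charSequence(), and that every
// chunk offers start() and end(), as in the loop it is based on.
function collectSentences(chunking) {
    // String(...) turns the Java string into a plain JavaScript string.
    var source = String(chunking.charSequence().toString());
    var result = [];
    for (var it = chunking.chunkSet().iterator(); it.hasNext(); ) {
        var chunk = it.next();
        // start() and end() are character offsets into the source text.
        result.push(source.substring(chunk.start(), chunk.end()));
    }
    return result;
}
```

Call it as collectSentences(chunking). Printing each entry with a running counter gives the same output as the loop.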
This script produces:
SENTENCE 1:
This text is a test text.
SENTENCE 2:
It's function is to be tested.
SENTENCE 3:
What do you think of that, mr. test text?
SENTENCE 4:
"I don't mind." said the test text.