All the services offered by EZClassifier can be accessed using the command through the ezc
command.
See the getting started section for the instructions for downloading and installing it.
General syntax
For java JRE 17+ is required
The command returns 0 on success or a value > 0 on failure. The logs are written on stdout by default and can be redirected.
All commands requires a valid api key in the environment variable API_KEY
export TC_API_KEY=<api-key>
Usage:
java -jar <path of the downloaded jar file> [-hV] [COMMAND]
A software agent that classifies text based on the provided prototypes
Options:
-h, --help
Show this help message and exit.-V, --version
Print version information and exit.
Commands:
model
Manage modelsclassify
Perform classification
Command model train
Train the model
model train [-hHSV] -C=<labelIndex> [-e=<apiEndpoint>]
[-i=<inputFilename>] [-k=<apiKey>] -n=<name> -P=<textIndex>
[-W=<weightIndex>]
You can enhance an already trained model by adding new examples through multiple calls to the “model train” command.
Options:
-C, --class-index=<labelIndex>
The column index in the CSV file that contains the field with the class attached to the text (from 0). By default is 1-e, --endpoint=<apiEndpoint>
Api endpoint. By defaulthttps://api.mopso.io/v1/tc
-h, --help
Show this help message and exit.-H, --header
Will ignore the first line of the CSV file.-i, --input=<inputFilename>
The file containing the training data, by default “-” that means std in (e.g. -i - ). The stream is supposed to be in CSV format and MUST contain two fields (“prototype” and “class”) and some additional fields.-k, --api-key=<apiKey>
A registered API key. If not present, the value from the env variable TC_API_KEY is used.-n, --name=<name>
Name of the model to train, create a new model if the name is not found.-P, --prototype-index=<textIndex>
The column index in the CSV file that contains the field with the text to classify (from 0)-S, --strict
Runs the program in strict mode: any partially recoverable exception thrown during the execution will stop the training process. If not run in strict mode, the application will try to compensate for as many errors as possible.-V, --version
Print version information and exit.-W, --weight-index=<weightIndex>
The column index in the CSV file that contains the field with the classification weight (from 0). Set to -1 if not present.
Command model ls
List models
model ls [-hV] [-e=<apiEndpoint>] [-k=<apiKey>]
Options:
-e, --endpoint=<apiEndpoint>
Api endpoint. By defaulthttps://api.mopso.io/v1/tc
-h, --help
Show this help message and exit.-k, --api-key=<apiKey>
A registered API key. If not present, the value from the env variable TC_API_KEY is used.-V, --version
Print version information and exit.
Command model rm
model rm [-hV] [-e=<apiEndpoint>] [-k=<apiKey>] -n=<name>
Models that are not used for more than 3 month are automatically deleted.
Options:
-e, --endpoint=<apiEndpoint>
Api endpoint. By defaulthttps://api.mopso.io/v1/tc
-h, --help
Show this help message and exit.-k, --api-key=<apiKey>
A registered API key. If not present, the value from the env variable TC_API_KEY is used.-n, --name=<name>
Name of the model to remove.-V, --version
Print version information and exit.
Command classify
Usage:classify [-hHSV] [-e=<apiEndpoint>] [-i=<inFilename>] [-I=<textIndex>]
[-k=<apiKey>] -n=<name> [-o=<outFilename>] [-t=<threshold>]
[-T=<threads>]
Perform classification
-e, --endpoint=<apiEndpoint>
Api endpoint. By defaulthttps://api.mopso.io/v1/tc
-h, --help
Show this help message and exit.-H, --header
If the flag is present, the first line of the input file is copied to the output with added columns ‘CLASS’ and ‘SIMILARITY_SCORE’. By default, it is assumed that the input has no header.-i, --input=<inFilename>
The input filename; “-” means stdin (e.g. -i - ). The file must be in CSV format.-I, --index=<textIndex>
The index in the CSV file that contains the field to classify (from 0). By default, the index is 0.-k, --api-key=<apiKey>
A registered API key. If not present, the value from the env variable TC_API_KEY is used.-n, --name=<name>
Model name.--no-buffer
Execute the program in interactive mode. Will ignore –input, –output and –header options.-o, --output=<outFilename>
The output filename; “-” means stdout (e.g. -o - ).-S, --strict
Runs the program in strict mode: any partially recoverable exception thrown during the execution (i.e. a classification that fails or a row that can’t be parsed) will stop the program, truncating the output to the last stable state. If not run in strict mode, the application will try to compensate for as many errors as it’s possible.-t, --threshold=<threshold>
Number between 0 (not included) and 1 (included) that is used to determine whether a match ‘SIMILARITY_SCORE’ is too low to be considered valid. In this case, the ‘CLASS’ is set to ‘OTHER’. Default value is 0.84.-T, --threads=<threads>
The number of parallel jobs to be used by the classification services, by default is 1. If more than 1 is used, the output order is not preserved. The value is capped to the number of CPU cores.-V, --version
Print version information and exit.