Command reference

All EZClassifier command syntax and options

    All the services offered by EZClassifier can be accessed through the ezc command. See the getting started section for instructions on downloading and installing it.

    General syntax

    Java (JRE 17 or later) is required.

    The command returns 0 on success or a value > 0 on failure. Logs are written to stdout by default and can be redirected.
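Because the exit status and the stdout logs are the only feedback described here, a script will usually want to capture both. The following is a minimal sketch, not part of EZClassifier: `run_logged` is a hypothetical helper that sends a command's stdout to a log file and passes its exit status through.

```shell
# Hypothetical helper: run a command, write its stdout (where the logs go by
# default) to a log file, and return the command's own exit status.
run_logged() {
  log_file="$1"
  shift
  status=0
  "$@" > "$log_file" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "command failed with exit status $status, see $log_file" >&2
  fi
  return "$status"
}

# Usage sketch (EZC_JAR is an assumed variable holding the jar path):
# run_logged ezc.log java -jar "$EZC_JAR" model ls
```

Note that redirecting stdout this way would also capture any output a command writes to stdout, so it suits commands whose results go to a file.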

    All commands require a valid API key, read from the environment variable TC_API_KEY unless it is passed with -k:

    export TC_API_KEY=<api-key>
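A wrapper script can check for the key before calling any command. This is a sketch under the assumption that the key is supplied through the environment rather than with -k; `require_api_key` is a hypothetical helper name, not something shipped with EZClassifier.

```shell
# Hypothetical helper: succeed only when TC_API_KEY is set and non-empty.
require_api_key() {
  if [ -z "${TC_API_KEY:-}" ]; then
    echo "TC_API_KEY is not set; export it or pass -k/--api-key" >&2
    return 1
  fi
}

# Usage sketch:
# require_api_key && java -jar <path of the downloaded jar file> model ls
```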
    

    Usage:

    java -jar <path of the downloaded jar file> [-hV] [COMMAND]
    

    A software agent that classifies text based on the provided prototypes

    Options:

    • -h, --help
      Show this help message and exit.

    • -V, --version
      Print version information and exit.

    Commands:

    • model
      Manage models

    • classify
      Perform classification

    Command model train

    Train the model

    model train [-hHSV] -C=<labelIndex> [-e=<apiEndpoint>]
                        [-i=<inputFilename>] [-k=<apiKey>] -n=<name> -P=<textIndex>
                        [-W=<weightIndex>]
    

    You can enhance an already trained model by adding new examples through multiple calls to the “model train” command.

    Options:

    • -C, --class-index=<labelIndex>
      The column index in the CSV file that contains the field with the class attached to the text (from 0). Defaults to 1.

    • -e, --endpoint=<apiEndpoint>
      API endpoint. Defaults to https://api.mopso.io/v1/tc

    • -h, --help
      Show this help message and exit.

    • -H, --header
      Will ignore the first line of the CSV file.

    • -i, --input=<inputFilename>
      The file containing the training data. Defaults to “-”, which means stdin (e.g. -i - ). The stream is expected to be in CSV format and MUST contain two fields (“prototype” and “class”); additional fields are allowed.

    • -k, --api-key=<apiKey>
      A registered API key. If not present, the value from the env variable TC_API_KEY is used.

    • -n, --name=<name>
      Name of the model to train. A new model is created if the name is not found.

    • -P, --prototype-index=<textIndex>
      The column index in the CSV file that contains the field with the text to classify (from 0)

    • -S, --strict
      Runs the program in strict mode: any partially recoverable exception thrown during the execution will stop the training process. If not run in strict mode, the application will try to compensate for as many errors as possible.

    • -V, --version
      Print version information and exit.

    • -W, --weight-index=<weightIndex>
      The column index in the CSV file that contains the field with the classification weight (from 0). Set to -1 if not present.
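As an illustration of how the index options map onto a file, the sketch below writes a small two-column training CSV and assembles a matching "model train" command. The sample rows, the model name support-tickets and the EZC_JAR variable are assumptions made for the example; the command is printed rather than executed.

```shell
# Hypothetical training file: text in column 0, class in column 1, no weights.
train_csv="$(mktemp)"
cat > "$train_csv" <<'EOF'
prototype,class
"refund not received",billing
"app crashes on login",bug
"how do I change my password",account
EOF

# EZC_JAR and the model name are assumptions for illustration only.
EZC_JAR="${EZC_JAR:-ezc.jar}"

# -H skips the header row, -P/-C select the text and class columns,
# -W=-1 states that the file has no weight column.
train_cmd="java -jar $EZC_JAR model train -n=support-tickets -H -i=$train_csv -P=0 -C=1 -W=-1"

# Print the command instead of running it, so the sketch has no side effects.
echo "$train_cmd"
```

Running the printed command again later with a new file and the same -n value would add examples to the existing model, as described above.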

    Command model ls

    List models

    model ls [-hV] [-e=<apiEndpoint>] [-k=<apiKey>]
    

    Options:

    • -e, --endpoint=<apiEndpoint>
      API endpoint. Defaults to https://api.mopso.io/v1/tc

    • -h, --help
      Show this help message and exit.

    • -k, --api-key=<apiKey>
      A registered API key. If not present, the value from the env variable TC_API_KEY is used.

    • -V, --version
      Print version information and exit.

    Command model rm

    Remove a model

    model rm [-hV] [-e=<apiEndpoint>] [-k=<apiKey>] -n=<name>
    

    Models that have not been used for more than 3 months are automatically deleted.

    Options:

    • -e, --endpoint=<apiEndpoint>
      API endpoint. Defaults to https://api.mopso.io/v1/tc

    • -h, --help
      Show this help message and exit.

    • -k, --api-key=<apiKey>
      A registered API key. If not present, the value from the env variable TC_API_KEY is used.

    • -n, --name=<name>
      Name of the model to remove.

    • -V, --version
      Print version information and exit.

    Command classify

    Perform classification

    classify [-hHSV] [-e=<apiEndpoint>] [-i=<inFilename>] [-I=<textIndex>] [-k=<apiKey>] -n=<name> [-o=<outFilename>] [-t=<threshold>] [-T=<threads>]

    Options:

    • -e, --endpoint=<apiEndpoint>
      API endpoint. Defaults to https://api.mopso.io/v1/tc

    • -h, --help
      Show this help message and exit.

    • -H, --header
      If the flag is present, the first line of the input file is copied to the output with added columns ‘CLASS’ and ‘SIMILARITY_SCORE’. By default, it is assumed that the input has no header.

    • -i, --input=<inFilename>
      The input filename; “-” means stdin (e.g. -i - ). The file must be in CSV format.

    • -I, --index=<textIndex>
      The index in the CSV file that contains the field to classify (from 0). By default, the index is 0.

    • -k, --api-key=<apiKey>
      A registered API key. If not present, the value from the env variable TC_API_KEY is used.

    • -n, --name=<name>
      Model name.

    • --no-buffer
      Execute the program in interactive mode. Ignores the --input, --output and --header options.

    • -o, --output=<outFilename>
      The output filename; “-” means stdout (e.g. -o - ).

    • -S, --strict
      Runs the program in strict mode: any partially recoverable exception thrown during the execution (e.g. a classification that fails or a row that can’t be parsed) will stop the program, truncating the output to the last stable state. If not run in strict mode, the application will try to compensate for as many errors as possible.

    • -t, --threshold=<threshold>
      Number between 0 (exclusive) and 1 (inclusive) used to decide whether a match’s ‘SIMILARITY_SCORE’ is too low to be considered valid. When it is, the ‘CLASS’ is set to ‘OTHER’. Defaults to 0.84.

    • -T, --threads=<threads>
      The number of parallel jobs used by the classification service. Defaults to 1. If more than 1 is used, the output order is not preserved. The value is capped at the number of CPU cores.

    • -V, --version
      Print version information and exit.
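The effect of --threshold can be pictured by re-applying a stricter threshold to an existing output file. The sketch below is an illustration only: the output layout (input columns followed by CLASS and SIMILARITY_SCORE), the sample rows and scores, and the strict "score below threshold" comparison are all assumptions, and the naive comma split would not handle quoted fields that contain commas.

```shell
# Hypothetical classify output, assumed to come from a run with -H and a
# permissive threshold such as -t=0.5.
out_csv="$(mktemp)"
cat > "$out_csv" <<'EOF'
text,CLASS,SIMILARITY_SCORE
refund not received,billing,0.91
where is my parcel,billing,0.62
EOF

threshold=0.84

# Keep the header, then set CLASS to OTHER on every row whose score (last
# column) is below the threshold. Assumes no commas inside fields.
relabelled="$(awk -F',' -v OFS=',' -v t="$threshold" '
  NR == 1 { print; next }
  ($NF + 0) < (t + 0) { $(NF - 1) = "OTHER" }
  { print }
' "$out_csv")"

echo "$relabelled"
```

With the sample data, only the second row falls below 0.84 and is relabelled as OTHER; the first row keeps its class.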