Getting Started

What you need to know to use EZClassifier

    Data classification is the process of organizing and categorizing data based on specific criteria or attributes. It helps make data more structured, accessible, and understandable.

    Let’s see an example

    Suppose you need to categorize a set of texts that contain mixed references to cats, actors, dogs, and other things that don’t matter.

    You will need a CSV file containing a few examples. In this file, each example corresponds to a row that provides at least two fields:

    • prototype: a text that exemplifies a typical element in the category specified in the second field
    • class: a short text that represents a category name

    EZClassifier uses these examples to create a personalized model. You can add as many prototypes and categories to the model as you like, as long as each category has at least one example. EZClassifier works with any language, even when several languages are mixed in the same text, data file, or set of examples.

    Once you have created your model, you can use it to classify your data stream. EZClassifier adds two fields to your data:

    • class: the predicted category for the row
    • similarity: a number from 0 to 1 that represents EZClassifier's confidence in its classification (1 = maximum confidence, 0 = no confidence).

    Here are some examples you can use to train your model:

    prototype | class
    Persian Cat: Known for their long, luxurious fur and sweet temperament, Persian cats are one of the most popular breeds | cat
    Maine Coon: These are among the largest domestic cat breeds. They have tufted ears, a bushy tail, and a friendly, gentle personality | cat
    Siamese Cat: Siamese cats are known for their striking blue almond-shaped eyes, short coat, and vocal nature | cat
    Ragdoll Cat: Ragdolls are large, affectionate cats known for their tendency to go limp when you hold them, hence the name Ragdoll | cat
    German Shepherd: Intelligent and versatile, often used in police and military work | dog
    Rottweiler: Strong and loyal, originally bred for herding and guarding | dog
    Siberian Husky: Known for their endurance and striking appearance, used as sled dogs | dog
    Doberman Pinscher: Agile and protective, often used as guard dogs | dog
    Meryl Streep: Known for her incredible talent and versatility, Meryl Streep is one of the most acclaimed and decorated actresses in Hollywood history | actor
    Leonardo DiCaprio: Leonardo DiCaprio is a highly respected actor who has starred in a wide range of critically acclaimed films | actor
    Viola Davis: Viola Davis is a talented actress known for her powerful performances | actor
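
The examples above could be stored in a CSV file along the following lines. This is a minimal sketch: the file name examples.csv and the header names prototype and class follow the field names described earlier, but the exact quoting rules EZClassifier expects are an assumption. Standard CSV double quotes are used here because the prototypes contain commas.

```shell
# Write a small examples file: one header row, then one row per prototype.
# Assumption: standard CSV quoting, so fields containing commas are double-quoted.
cat > examples.csv <<'EOF'
prototype,class
"Persian Cat: Known for their long, luxurious fur and sweet temperament, Persian cats are one of the most popular breeds",cat
"German Shepherd: Intelligent and versatile, often used in police and military work",dog
"Viola Davis: Viola Davis is a talented actress known for her powerful performances",actor
EOF

# Sanity check: list the distinct categories (the last comma-separated field),
# to confirm that every category has at least one example.
tail -n +2 examples.csv | awk -F, '{ print $NF }' | sort -u
```

Reading the category from the last field avoids having to parse the quoted free text in the first column.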

    Here are the data you want to classify:

    data
    Bengal Cat: Bengal cats have a wild appearance with rosette-shaped spots on their coat, reminiscent of a leopard.
    Scottish Fold: Scottish Fold cats are recognized by their unique folded ears, which give them an endearing appearance.
    Sphynx Cat: Sphynx cats are a hairless breed with wrinkled skin.
    Shovel: A shovel is a tool with a flat, wide blade and a long handle, used for digging, lifting, and moving soil, gravel, or materials.
    Garden Fork: A garden fork has sturdy tines and a handle, used for loosening soil, breaking up clumps, and mixing in compost.
    Denzel Washington: Denzel Washington is an iconic American actor with a commanding presence on screen.
    Abyssinian: Abyssinian cats are active and playful with a short, ticked coat.
    Rake: Rakes have curved or straight teeth attached to a handle and are used for leveling soil, removing debris, and spreading mulc
    Poodle: Poodles are highly intelligent and come in different sizes: Standard, Miniature, and Toy.
    Dachshund: Dachshunds, or wiener dogs, are known for their long bodies and short legs.
    Yorkshire Terrier: Yorkies are small but spirited dogs with long, silky hair
    Boxer: Boxers are medium to large dogs with strong, muscular bodies.
    Cate Blanchett: Cate Blanchett is an Australian actress known for her elegance and versatility.
    Tom Hanks: Tom Hanks is a beloved American actor known for his likable and relatable on-screen persona.
    Siberian Husky: Huskies are known for their striking appearance, with a thick double coat and blue or multicolored eyes.
    British Shorthair: British Shorthairs are known for their dense, plush coat and round faces.
    Russian Blue: Russian Blue cats have a distinctive bluish-gray coat and striking green eyes.
    Hoe: A hoe has a flat, blade-like head and a long handle, used for weeding, cultivating, and breaking up soil.
    Pruning Shears: Prunin

    Here are some results:

    data | class | similarity
    Bengal Cat: Bengal cats have a wild appearance with rosette-shaped spots on their coat, reminiscent of a leopard. | cat | 0.864848
    Scottish Fold: Scottish Fold cats are recognized by their unique folded ears, which give them an endearing appearance. | cat | 0.858967
    Sphynx Cat: Sphynx cats are a hairless breed with wrinkled skin. | cat | 0.858745
    Shovel: A shovel is a tool with a flat, wide blade and a long handle, used for digging, lifting, and moving soil, gravel, or materials. | OTHER | 0.760553
    Garden Fork: A garden fork has sturdy tines and a handle, used for loosening soil, breaking up clumps, and mixing in compost. | OTHER | 0.764859
    Abyssinian: Abyssinian cats are active and playful with a short, ticked coat. | cat | 0.865725
    Rake: Rakes have curved or straight teeth attached to a handle and are used for leveling soil, removing debris, and spreading mulc | OTHER | 0.755273
    Poodle: Poodles are highly intelligent and come in different sizes: Standard, Miniature, and Toy. | dog | 0.843957
    Dachshund: Dachshunds, or wiener dogs, are known for their long bodies and short legs. | dog | 0.845551
    Yorkshire Terrier: Yorkies are small but spirited dogs with long, silky hair | dog | 0.853720
    Boxer: Boxers are medium to large dogs with strong, muscular bodies. | dog | 0.860509
    Denzel Washington: Denzel Washington is an iconic American actor with a commanding presence on screen. | actor | 0.855477
    Cate Blanchett: Cate Blanchett is an Australian actress known for her elegance and versatility. | actor | 0.876928
    Tom Hanks: Tom Hanks is a beloved American actor known for his likable and relatable on-screen persona. | actor | 0.841147
    Siberian Husky: Huskies are known for their striking appearance, with a thick double coat and blue or multicolored eyes. | dog | 0.931415
    British Shorthair: British Shorthairs are known for their dense, plush coat and round faces. | cat | 0.878175
    Russian Blue: Russian Blue cats have a distinctive bluish-gray coat and striking green eyes. | cat | 0.891341
    Hoe: A hoe has a flat, blade-like head and a long handle, used for weeding, cultivating, and breaking up soil. | OTHER | 0.760036
    Pruning Shears: Prunin | OTHER | 0.781802

    Try it out!

    HW & SW Prerequisites

    To run EZClassifier, you need a computer with Java 17 or later installed. You can download Java for your platform (Linux, macOS, Windows) from the official Java site.
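
To confirm the Java requirement before installing anything else, a check along these lines may help. It is a sketch, not part of EZClassifier: the helper name java_major is hypothetical, and it only parses the version string that `java -version` prints.

```shell
# Extract the major Java version from a `java -version` line.
# Handles the modern scheme ("17.0.9") and the legacy one ("1.8.0_391").
java_major() {
  v=$(printf '%s\n' "$1" | sed -E 's/.*version "([^"]+)".*/\1/')
  case "$v" in
    1.*) v=${v#1.}; echo "${v%%.*}" ;;   # legacy: 1.8.0_391 -> 8
    *)   echo "${v%%[.-]*}" ;;            # modern: 17.0.9 -> 17
  esac
}

# `java -version` writes to stderr, so redirect it before searching for the version line.
if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java -version 2>&1 | grep -m 1 'version "')")
  if [ "$major" -ge 17 ] 2>/dev/null; then
    echo "Java $major found: recent enough"
  else
    echo "Java $major found: version 17 or later is needed"
  fi
else
  echo "java not found on PATH"
fi
```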

    Install EZClassifier

    Next, download the latest version of the EZC JAR file to a directory of your choice.

    Lastly, you’ll need an API Key to enable the services (see prices here). Proof of Concept (POC) and free plans are available upon request.

    Create a shortcut for launching the java command and set the TC_API_KEY environment variable to your API key:

    For example, with bash, open a terminal and type:

    export TC_API_KEY=HERE-IS-YOUR-API-KEY
    alias ezc='java -jar ezc.jar' 
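
Note that the alias above refers to ezc.jar by a relative path, so it resolves against whatever directory you are in when you run it. As an alternative sketch, a shell function can point at a fixed location and also check that the API key is set. The EZC_JAR variable and the Downloads location are assumptions for illustration, not something EZClassifier defines.

```shell
# Hypothetical location of the downloaded JAR; adjust to where you saved it.
EZC_JAR="${EZC_JAR:-$HOME/Downloads/ezc.jar}"

# Drop any earlier alias so it is not expanded when the function is defined.
unalias ezc 2>/dev/null || true

# Wrapper that refuses to run without an API key and forwards all arguments.
ezc() {
  if [ -z "${TC_API_KEY:-}" ]; then
    echo "TC_API_KEY is not set" >&2
    return 1
  fi
  java -jar "$EZC_JAR" "$@"
}
```

With this in place, `ezc --version` behaves the same from any directory, and a missing key is reported before Java is started.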
    

    For example, on Windows, open a terminal window (CMD) and type:

    set TC_API_KEY=HERE-IS-YOUR-API-KEY
    doskey ezc=java -jar "%USERPROFILE%\Downloads\ezc.jar" $*
    

    Test that the system is working:

    # make sure it reports a Java version of 17 or later
    ezc --version
    

    Step 1: download example files

    Here you can download some sample data:

    Step 2: create a model

    Create a new model from your examples:

    ezc model train --name=mymodel --header --input=examples.csv
    

    Be sure to pass the path (relative or absolute) of the downloaded model source file as the --input argument.

    To list all available models, use ezc model ls.

    Step 3: classify your data using the created model:

    ezc classify --name=mymodel --header --input=input-data.csv --output=result.csv --threshold=0.84
    

    The result.csv file in your current directory will contain your classified data. Note that rows with a confidence (similarity) below 0.84 are assigned to the “OTHER” category.
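
Once result.csv exists, a quick summary can help you decide whether the threshold suits your data. The sketch below assumes that class and similarity are the last two comma-separated fields of each row, unquoted and without commas of their own, as the results shown earlier suggest. The function names count_by_class and below_cutoff are hypothetical helpers, not EZClassifier commands.

```shell
# Count rows per predicted class, skipping the header row.
# Reading fields from the end of the line keeps commas in the free text harmless.
count_by_class() {
  awk -F, 'NR > 1 { n[$(NF-1)]++ } END { for (c in n) print c, n[c] }' "$1" | LC_ALL=C sort
}

# Print rows whose similarity (last field) is below the given cutoff.
below_cutoff() {
  awk -F, -v t="$2" 'NR > 1 && $NF + 0 < t + 0' "$1"
}

# Only run the summary if a classified file is present.
if [ -f result.csv ]; then
  count_by_class result.csv
  below_cutoff result.csv 0.84
fi
```

If many rows you consider valid show up in the second listing, a lower --threshold may be worth trying. Rows that span several lines inside quotes would need a real CSV parser rather than awk.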

    Remove your model with ezc model rm --name=mymodel.