Training a model is the process of getting a machine to "learn" from data: we supply an adequate training set of relevant examples so that the model can later recognize new instances of the same type of data that were not part of the training set. This is done by pairing the input data with the expected output. For example, to train a model to recognize human speech in English, we must supply it with a training data set containing recordings of a large number of people speaking English. With the help of this training set, the machine learns that the samples are human speech by associating them with common patterns in their feature vectors (for example, MFCCs). Once a model is believed to be trained, it must be tested with test data; in the context of the above example, this could be a few more samples of human speech that are not present in the training data set.
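The sketch below illustrates this train/test workflow with scikit-learn. The "MFCC" feature vectors here are random placeholders standing in for features that would, in practice, be extracted from audio (for instance with a library such as librosa); the labels play the role of the expected output paired with each input.

```python
# Minimal sketch of the train/test workflow, assuming placeholder data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# 200 samples of 13-dimensional "MFCC" vectors, each paired with a label:
# 1 = English speech, 0 = non-speech (the label is the expected output).
X = rng.normal(size=(200, 13))
y = rng.integers(0, 2, size=200)

# Hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = SVC()                   # any comparable classifier would do here
model.fit(X_train, y_train)     # the training phase
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```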
Training a model can be accomplished with a supervised, an unsupervised or a semi-supervised algorithm. A supervised algorithm helps the machine infer a mapping from inputs to outputs, because the data in the training set is labelled. An unsupervised algorithm, on the other hand, lets the machine learn from the data on its own by finding hidden patterns or organization in it (the data is unlabelled). In semi-supervised learning, both labelled and unlabelled data are used. Commonly, a small amount of labelled data is combined with a large amount of unlabelled data, where the labelled data is used to help understand or learn the structure of the unlabelled data.
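A rough illustration of the three settings, again with scikit-learn on toy data; the particular estimators chosen are arbitrary and only meant to show how the labels are (or are not) supplied.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# Supervised: every training sample comes with its label.
LogisticRegression().fit(X, y)

# Unsupervised: no labels at all; the algorithm looks for structure
# (here, two clusters) on its own.
KMeans(n_clusters=2, n_init=10).fit(X)

# Semi-supervised: a small amount of labelled data plus a large amount
# of unlabelled data (marked with -1, by scikit-learn convention).
y_partial = y.copy()
y_partial[10:] = -1          # keep only the first 10 labels
LabelSpreading().fit(X, y_partial)
```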
The training phase is extremely important, since the datasets used in the process largely influence the machine's ability to learn, its performance (measured in terms of how many test cases are identified correctly or erroneously) and its efficiency (in terms of speed and energy utilization). The datasets must be large enough and, in the case of speech and sound applications, diverse enough. Most often, the richer the dataset used in the training phase, the more accurate the results.
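One simple way to quantify the performance mentioned above is to count how many test cases were identified correctly or erroneously. The label arrays below are made-up placeholders standing in for a model's test-set predictions.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # expected outputs for the test cases
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # the model's predictions

print("accuracy:", accuracy_score(y_true, y_pred))   # fraction identified correctly
print(confusion_matrix(y_true, y_pred))              # correct vs. erroneous, per class
```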