BIU understands CV training data input with the PyTorch framework

Mondo Technology Updated on 2024-02-18

BIU's new series of blog posts will focus on AI-related fields. If you want to learn about Qualcomm Bluetooth, check out the previous series or follow the blogger Sonic Waves and Radio Waves.

1. If the model has requirements on its input, the Dataset constructor receives the transforms used to convert the data, including a target transform for the labels (the standard ones can all be found in the transforms package); these are applied later when the DataLoader calls __getitem__. Other configuration parameters can also be passed in and saved as attributes for later use. The constructor's main job is to organize the data so that each sample and its label correspond to each other in a list, which makes samples easy to fetch and makes it easy for __len__ to count them.

2. For example, if data and labels are stored in different folders, we can pair up their paths and put them in a list, and __getitem__ then loads the content from those paths. If instead the labels are all stored in a single file, we need to read its contents and map each entry to the corresponding data file. And if the data file names do not match the label file names at all, we need a third index table to find the correspondence, reading it in as well to match them up.

3. In general, the constructor organizes the raw data into training samples so that __getitem__ can extract them quickly and shorten training time.

4. __len__ returns the total size of the dataset: the number of samples available.

5. __getitem__ returns the sample at a specified index: it looks up the corresponding sample, and if a transform was provided, it first converts the data or label to the required format. For example, resizing images to a common size, normalizing the data, or converting to tensor format. Labels may be converted into numeric formats, coordinate formats, one-hot encodings, and so on. These transformations exist to satisfy the model's input requirements or to improve model performance.
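The five points above can be sketched as one custom Dataset. This is a minimal illustration, not the author's attached routine: the folder layout, the class name `PairedFolderDataset`, and the label-file naming convention (same stem, `.txt` extension) are all my assumptions.

```python
import os

from PIL import Image
from torch.utils.data import Dataset


class PairedFolderDataset(Dataset):
    """Sketch: images and label files live in two folders, paired by file stem."""

    def __init__(self, img_dir, label_dir, transform=None, target_transform=None):
        self.transform = transform                # converts the image for the model
        self.target_transform = target_transform  # converts the label (e.g. one-hot)
        # Organize the data: pair each image path with its label path in one list
        self.samples = [
            (os.path.join(img_dir, f),
             os.path.join(label_dir, os.path.splitext(f)[0] + ".txt"))
            for f in sorted(os.listdir(img_dir))
        ]

    def __len__(self):
        # Number of samples available
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, label_path = self.samples[idx]
        image = Image.open(img_path).convert("RGB")
        with open(label_path) as fh:
            label = fh.read().strip()
        if self.transform:                 # e.g. Resize / Normalize / ToTensor
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label
```

With this in place, `len(dataset)` counts the pairs built in the constructor, and `dataset[i]` loads and transforms one sample on demand.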

In addition to the basic functions above, some derivative frameworks add many tuning features: generating augmented data when there are not enough training samples; caching data or labels in RAM to serve training data faster; discarding data or labels that lack a matching pair; using only a subset of the samples; and other personalized adjustments of this kind.
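Two of those tuning behaviours, dropping unpaired samples and keeping only a subset, can be sketched in plain Python. The function name `build_sample_list` and the `.txt`-per-image layout are my own assumptions for illustration.

```python
import os


def build_sample_list(img_dir, label_dir, limit=None):
    """Sketch of two dataset 'tuning' features: discard images that have
    no matching label file, and optionally keep only the first `limit`
    samples for quick experiments."""
    pairs = []
    for f in sorted(os.listdir(img_dir)):
        stem = os.path.splitext(f)[0]
        label_path = os.path.join(label_dir, stem + ".txt")
        if not os.path.exists(label_path):  # drop data without a paired label
            continue
        pairs.append((os.path.join(img_dir, f), label_path))
    return pairs[:limit] if limit else pairs
```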

The DataLoader's role is to manage data loading and feed the model input in the format it requires. To speed up training, the DataLoader packs multiple samples into a batch (merging the data and labels into corresponding large tensors) and sends them into the model together. (Packing here means stacking multiple tensors along a new dimension; fetching is done one batch at a time through an iterator.) However, if the sample tensors have inconsistent shapes, for example an object-detection label with a variable number of boxes, stacking them directly raises an error. For this reason, the DataLoader accepts a collate_fn argument to customize the packing; just make sure the stacked tensors still meet the model's input requirements and that the DataLoader's iterator can return the required sample data and labels on each step.
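A minimal collate_fn for the object-detection case described above might look like this sketch: images stack into one tensor, while each sample's box tensor keeps its own length, so the labels are returned as a list instead of being stacked.

```python
import torch


def detection_collate(batch):
    """Custom collate_fn sketch: batch is a list of (image, boxes) pairs.
    Images share a shape and can be stacked; box tensors have a variable
    number of rows, so they stay in a Python list."""
    images = torch.stack([img for img, _ in batch])  # (B, C, H, W)
    targets = [boxes for _, boxes in batch]          # list of (N_i, 4) tensors
    return images, targets
```

It is then passed to the loader, e.g. `DataLoader(dataset, batch_size=4, collate_fn=detection_collate)`.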

The batch size should not be set too small, otherwise the model will struggle to converge. It should not be too large either: the size of each batch is limited by memory capacity, and if it is too large, memory may run out in the middle of training. (Tip: GPUs tend to perform better with batch sizes that are powers of 2, so prefer those; but most datasets are not evenly divisible by such numbers, so there is often a remainder that cannot fill a batch, and you can simply choose not to use the leftover data.)

In addition, if the data is sorted by category, feeding it in that order will also hurt the model's convergence speed, so the DataLoader provides an option to shuffle the data. Beyond these basics there are other ways to speed up training, such as putting data in page-locked (pinned) memory to speed up transfers, or running the DataLoader with multiple worker processes; all of these can be configured directly on the DataLoader, provided sufficient resources are available.
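Those options map directly onto DataLoader arguments. A sketch with a throwaway dataset (the sizes and worker count are arbitrary choices, not recommendations):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))
loader = DataLoader(
    ds,
    batch_size=16,
    shuffle=True,     # reshuffle each epoch so category-sorted data doesn't slow convergence
    pin_memory=True,  # page-locked host memory for faster CPU-to-GPU copies
    num_workers=2,    # load batches in parallel worker processes
)
```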

The attachment (see Datong) contains a routine demonstrating a custom Dataset and DataLoader usage scenario, which you can refer to if needed. That is all for this blog post; if you have any questions, please leave a comment below and I will answer as soon as possible (o o). Thank you for reading, and see you next time.

Simplicity is the result of long-term effort, not the starting point. - Not me

FAQ 1: Can Keras use this method?

A1: No, Keras is based on the TensorFlow framework, which has its own data pipeline.

FAQ 2: What are the label formats?

A2: See the previous blog post - BIU understands the label format of AI: object detection training data.

FAQ 3: Can I get the data of the dataloader and display it?

A3: Yes, but if there is a transform, the sample will have been converted into a tensor, so it cannot be displayed directly and needs to be inverse-transformed first.
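The inverse transform for the common Normalize step can be sketched as follows; the function name `denormalize` is mine. Since Normalize computes (x - mean) / std per channel, the inverse is x * std + mean:

```python
import torch


def denormalize(t, mean, std):
    """Undo a per-channel transforms.Normalize on a (C, H, W) tensor so
    the image can be displayed again."""
    mean = torch.tensor(mean).view(-1, 1, 1)
    std = torch.tensor(std).view(-1, 1, 1)
    return t * std + mean
```

After denormalizing, the tensor can be converted back to an image (e.g. by permuting to HWC and scaling to 0-255) for display.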

FAQ 4: Should train and val use the same dataloader?

A4: Not exactly the same, because train and val have different tasks: val does not need to consider things like improving model performance, so its transforms do not need to include data augmentation.

FAQ 5: Is it OK to use image data in a different format?

A5: Yes, but you need a transform to convert it to RGB format first.
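With PIL this conversion is a one-liner; the helper name `load_as_rgb` is my own for illustration.

```python
from PIL import Image


def load_as_rgb(path):
    """Open an image in any PIL-supported format (BMP, grayscale PNG, ...)
    and unify it to RGB before handing it to the transform pipeline."""
    return Image.open(path).convert("RGB")
```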

Log in to Datong, ** the attached technical documents, to learn more about CV training data input with the PyTorch framework.
