Imagine this scenario: a programmer at the company feeds the project requirements into an AI programming assistant, and the assistant quickly generates the corresponding code, including the front-end code, the back-end code, and the SQL statements for operating the database. The front-end code uses the company's packaged components, the back-end code uses the company's shared class libraries, and the tables in the SQL statements come from the company's own database. Programmers then only need to tidy up this code to get it running, and work that used to take a month can be finished in just a few hours. What a leap in productivity that would be!
That is why software companies should be among the first to build their own large models. So how do you build one?
1. Start from an existing open-source model. Training one from scratch is far too expensive, and there are plenty of open-source models on the market that can be used directly (see the loading sketch after this list).
2. Train (fine-tune) the large model. Prepare the company's data as a training dataset for the model, and evaluate the model after training (see the dataset sketch after this list).
3. Deploy the large model. The hardware required depends on the size of the model you end up with.
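As a rough sketch of step 1, loading an existing open-source model with the Hugging Face transformers library might look like the following. The model name `Qwen/Qwen2-7B-Instruct` is only an illustrative choice, not something prescribed here; any open-source model the company has vetted would do.

```python
# A minimal sketch of step 1: reuse an existing open-source model instead of
# training from scratch. The model name below is only an illustrative example.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-7B-Instruct"  # assumed example; substitute your chosen model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # load in FP16/BF16 when the hardware supports it
    device_map="auto",    # spread the weights across the available GPUs
)

prompt = "Generate a SQL statement that lists all orders from the last 30 days."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```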
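For step 2, "preparing the company's data into a dataset" usually means converting internal requirements, code, and documentation into instruction–response pairs. A hedged sketch, assuming the source data is a list of (requirement, code) records; the file name and field names are placeholders, not a fixed format:

```python
# A minimal sketch of step 2: turn internal (requirement, code) records into a
# JSONL instruction-tuning dataset. File name and field names are placeholders.
import json

records = [
    {
        "requirement": "Add an endpoint that returns the current user's orders.",
        "code": "def list_orders(user_id): ...",
    },
    # ... more records exported from the company's project repositories
]

with open("company_finetune.jsonl", "w", encoding="utf-8") as f:
    for r in records:
        sample = {
            "instruction": "Implement the following requirement using the "
                           "company's components and class libraries.",
            "input": r["requirement"],
            "output": r["code"],
        }
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```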
The key resource is GPUs. Taking a 7B model as an example, inference requires about 14 GB of video memory, while full fine-tuning needs at least 140 GB. For a 13B model, inference requires about 26 GB (32 GB to be safe), and fine-tuning needs at least 260 GB. A 70B model can realistically only be used for inference, which already takes about 140 GB of video memory; even an 8-card A100 or H100 server is not enough for full fine-tuning.
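These numbers follow a simple rule of thumb: FP16 inference takes roughly 2 bytes per parameter, and full fine-tuning (weights, gradients, and optimizer states) takes on the order of ten times that; activations and KV cache are ignored. A back-of-the-envelope calculator, purely as a sketch of that arithmetic:

```python
# Rough GPU memory estimates mirroring the rule of thumb above:
# ~2 bytes per parameter for FP16 inference, ~10x that for full fine-tuning.
def estimate_vram_gb(params_billions: float) -> tuple[float, float]:
    inference_gb = params_billions * 2   # 2 bytes/param in FP16
    finetune_gb = inference_gb * 10      # rough factor for full fine-tuning
    return inference_gb, finetune_gb

for size in (7, 13, 70):
    inf, ft = estimate_vram_gb(size)
    print(f"{size}B model: ~{inf:.0f} GB inference, ~{ft:.0f} GB full fine-tuning")
```

Running this reproduces the figures above: 14/140 GB for 7B, 26/260 GB for 13B, and 140/1400 GB for 70B, which is why eight 80 GB A100s or H100s (640 GB total) still fall short of fully fine-tuning a 70B model.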