Hand-building an AI Pin alone? The era of Sam Altman's one-person unicorn company has arrived

Mondo Psychological Updated on 2024-02-09

Imagine a device smaller than the palm of your hand that records the sounds around you anytime, anywhere, and converts them into text so that you can communicate with large language models. Would you consider getting one? And what if I told you that you can build such a device yourself for less than 100 dollars?

Yes, that amounts to building an AI Pin by hand.

Adam C., CEO of Cado, the UK's first forensic investigation platform, has shown how he built a voice collector using only a Coral AI micro development board and the board's optional Bluetooth module. He calls the device "ADEUS". The word means goodbye in Portuguese, and for this device it stands for "goodbye to the network and regulation": Internet companies cannot collect users' private data through the device.

As you can see from the figure above, the board carries a camera and a microphone. Its MCU (microcontroller) is the NXP i.MX RT1176, which is based on the ARM architecture and has two cores, a Cortex-M4 and a Cortex-M7. Frankly, both are microcontroller-class cores at the low end of the Cortex family and do not provide much computing power.

At this point you may be thinking, "Isn't this pointless? What is such a weak MCU good for?" That is exactly the right question. Look at the chip that stands out from the others, the one engraved with the large "Coral" logo. This is a Coral AI Edge TPU coprocessor, which provides 4 TOPS of computing power (on INT8 data). TPU, short for tensor processing unit, is a concept introduced by Google: a chip designed specifically for deep learning and machine learning workloads.

The Coral AI Edge TPU is not the same thing as a full TPU. As the name "Edge TPU" suggests, it is a TPU for edge devices. Its compatibility and performance are far below those of a full TPU, but it consumes little power and is small. Of course, every neural network model has different performance requirements; with an open-source model like the one Adam Ch. installed in Adeus, it generally performs reasonably well.

The rest is easy. Adam Ch. found open-source AI speech-to-text software on the Internet, connected ADEUS to a computer, installed the software, and that was all. If you like, you can also install open-source AI software for the board's camera, for example to recognize faces or objects. By now the logic should be clear: electronic products are built around artificial intelligence from start to finish, and every component serves it. As long as the hardware provides enough computing power, the function can be realized.

Without artificial intelligence, reproducing the process of "recording sound and converting it to text" is very difficult. First you need a module that can pick up sound, usually a microphone. The microphone captures an analog signal, which may need pre-processing such as filtering and amplification to ensure quality and compatibility, and each step requires its own chip.

Then comes the most important step: converting the analog signal into a digital signal so that a chip can perform digital signal processing. The digital signal is then processed, for example with noise reduction and feature extraction, to prepare it as input for the speech recognition engine. Once the speech recognition engine has processed the signal, the transcribed text has to be written to a suitable storage device or sent over a communication interface.
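To make the traditional chain more concrete, here is a minimal sketch in Python of the stages just described: analog-to-digital conversion, a simple pre-processing filter, and feature extraction. Everything in it is an illustrative assumption rather than anything Adam Ch. built. The function names, the 16 kHz sample rate, the 0.97 filter coefficient, and the frame sizes are hypothetical choices, and a synthetic sine wave stands in for the microphone signal.

```python
import math

def quantize(samples, bits=16):
    """Simulate an ADC: map floats in [-1.0, 1.0] to signed integers."""
    full_scale = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * full_scale) for s in samples]

def pre_emphasis(samples, coeff=0.97):
    """First-order high-pass filter often applied before feature extraction."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]

def frame_signal(samples, frame_len, hop):
    """Cut the signal into overlapping, fixed-length frames."""
    return [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]

def log_energy(frame):
    """A very simple per-frame feature: log of the mean squared amplitude."""
    energy = sum(x * x for x in frame) / len(frame)
    return math.log(energy + 1e-10)

# Assumed input: 0.1 s of a 440 Hz tone standing in for the microphone signal.
sample_rate = 16000
analog = [
    0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
    for n in range(sample_rate // 10)
]

digital = quantize(analog)               # analog-to-digital conversion
emphasized = pre_emphasis(digital)       # pre-processing (filtering)
frames = frame_signal(emphasized, frame_len=400, hop=160)  # 25 ms frames, 10 ms hop
features = [log_energy(f) for f in frames]  # would be fed to a recognition engine
```

A real recognizer would use richer features than log energy and would still need the recognition engine itself, which is the part a model such as a speech-to-text network replaces.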

Compare the two approaches and you will see just how much work artificial intelligence saves!

To be honest, $100 is still a bit too expensive, so Adam Ch. plans to build future versions of Adeus on the Raspberry Pi Zero.

He is not alone. Ethan Sutin, CTO of the chat app Squad, had a similar idea, but what he wants is to communicate with large language models anytime, anywhere. So he combined Apple's M1 chip with OpenAI's Whisper to make a ChatGPT 3.5 that can be "carried in his pocket".

[Figure: Apple's M1 chip and microphone array]

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It is described in the paper "Robust Speech Recognition via Large-Scale Weak Supervision" by Alec Radford et al. of OpenAI. Trained on about 680,000 hours of labeled audio, Whisper generalizes well across many datasets and domains without fine-tuning.

The device has no switch, so activating Whisper also takes help from artificial intelligence. Ethan uses Silero VAD, a voice activity detection (VAD) model. He chose it mainly because Silero's JIT model is only about 1 MB in size, and storage capacity is what portable devices lack most.

With these two pieces understood, you will see that Ethan's approach is simpler than Adam Ch.'s. The device uses Silero to detect whether speech is reaching the microphone, then uses the Whisper model to transcribe the speech into text. The transcribed text is passed through a mobile phone to a large language model, and the model's response comes back to the user, so he can communicate with the large language model anytime, anywhere. In essence, he too is using artificial intelligence to build hardware. Apple's M1 chip costs about $40, which makes it much cheaper than the Coral AI board.
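The control flow of such a device can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ethan's code: `energy_vad` is a crude loudness threshold standing in for Silero, and the `transcribe` and `ask_llm` callables are hypothetical placeholders for Whisper and for the phone's connection to a large language model.

```python
import math
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]

def energy_vad(frame: Frame, threshold: float = 0.02) -> bool:
    """Stand-in for a real VAD model: True when the frame is loud enough."""
    if not frame:
        return False
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return rms > threshold

def run_pipeline(
    frames: Iterable[Frame],
    is_speech: Callable[[Frame], bool],
    transcribe: Callable[[List[Frame]], str],
    ask_llm: Callable[[str], str],
) -> List[str]:
    """Buffer frames while speech is detected; on silence, transcribe and query."""
    replies: List[str] = []
    buffer: List[Frame] = []
    for frame in frames:
        if is_speech(frame):
            buffer.append(frame)
        elif buffer:
            # Speech just ended: hand the whole segment to the recognizer.
            replies.append(ask_llm(transcribe(buffer)))
            buffer = []
    if buffer:
        # Flush a segment that was still open when the input ended.
        replies.append(ask_llm(transcribe(buffer)))
    return replies

# Hypothetical stand-ins so the sketch runs without any model installed.
silence = [0.0] * 160
speech = [0.1] * 160
frames = [silence, speech, speech, silence, speech, silence]

def fake_transcribe(segment: List[Frame]) -> str:
    return f"<{len(segment)} frames of speech>"

def fake_llm(text: str) -> str:
    return f"reply to {text}"

replies = run_pipeline(frames, energy_vad, fake_transcribe, fake_llm)
```

The point of gating on the detector is that the heavier transcription model only runs when there is something to transcribe, which matters on a small battery-powered device.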

[Figure: Apple's M1 chip]

Sam Altman, CEO of OpenAI, has said that a company valued at $1 billion with only a single employee is becoming possible, with artificial intelligence as its core competitive advantage.

In the future, especially in smart wearables, hardware will very likely be built on the principle of "prepare only as much computing power as the function needs". For example, the Raspberry Pi and Apple's M1 chip were chosen for the two devices above because the memory, video memory, and computing power they provide meet the requirements. In general, GPU memory (video memory) mainly holds model parameters, intermediate results, and the state used for model optimization. System memory mainly holds training data, model parameters, and runtime data. When training large deep learning models, both system memory and video memory must be large enough to hold the data and the model parameters.
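A rough back-of-the-envelope calculation shows how model size translates into memory. The sketch below multiplies a parameter count by the bytes per parameter at a given precision. The function name is hypothetical, the parameter counts are the approximate figures published for the smaller Whisper models and are used here only as examples, and the result covers the weights alone, not intermediate results or runtime data.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def param_memory_mib(num_params: int, dtype: str) -> float:
    """Memory needed for the model weights alone, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)

# Approximate published parameter counts (assumed here for illustration).
WHISPER_PARAMS = {
    "tiny": 39_000_000,
    "base": 74_000_000,
    "small": 244_000_000,
}

for name, count in WHISPER_PARAMS.items():
    sizes = ", ".join(
        f"{dtype}: {param_memory_mib(count, dtype):.0f} MiB"
        for dtype in BYTES_PER_PARAM
    )
    print(f"{name:>5}: {sizes}")
```

Estimates like this explain why lower-precision formats such as INT8 are attractive on small boards: the same model needs a quarter of the memory it would take at 32-bit precision.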

[Figure: Raspberry Pi]

We can sum up this hardware trend in one sentence: the Tao begets one, one begets two, two begets three, and three begets everything. What makes these inventors great is not how exquisite their craftsmanship is, but how skillfully they have integrated artificial intelligence into hardware products. As technology continues to advance, we can expect an era in which the production cost of smart devices falls significantly. Advanced sensors, microprocessors, and artificial intelligence components of all kinds will become cheaper and easier to obtain, so that hobbyists and even the general public can build their own feature-rich smart hardware at relatively low cost. With the support of the open-source community and the growth of the sharing economy, the software resources and technical tutorials needed to build smart devices will also become easier to find, lowering the barrier to entry even further.
