The U.S. intends to ban data flows to countries such as China

Mondo Technology Updated on 2024-03-08

On February 28, the White House announced that **Biden had officially signed the "Executive Order on Preventing Countries of Concern from Obtaining a Large Amount of Sensitive Personal Data of U.S. Citizens and U.S. Government-Related Data"**, restricting six "countries of concern", including China, Russia and Iran, and entities related to them from accessing and using "sensitive data" of U.S. citizens and the U.S. government. This covers genomic data, biometric data, personal health data, geolocation data, financial data and specific types of personally identifiable information, as well as sensitive government-related data. It is reported that this is the first time in U.S. history that cross-border data transfers to China have been banned. Notably, among the "sensitive data" categories Biden cited, biotechnology can be called the "hardest-hit area".

In fact, as early as 2020, our Dr. Tao speculated about this very problem, and rereading the piece today, it still makes sense. The article not only describes exactly where NCBI could become a chokepoint, but also introduces the domestic alternative platforms. We are republishing it now in the hope that it offers some fresh inspiration.

Point of view.

1) The habitually used and irreplaceable NCBI is a sword of Damocles hanging over the heads of non-American molecular biology researchers.

2) What gets "stuck by the neck" is not only technology, but also ways of thinking.

3) We should not just wait for major national projects; we must also work to save ourselves.

4) A good bioinformatics analysis platform is for biologists, not analysts.

What is a "stuck neck" technology?

Since 2018, insightful people in China have spoken out through various channels, pointing out that there are many areas where "my country" is not so powerful. The most famous voice among them is Liu Yadong, editor-in-chief of Science and Technology Daily. For details, see the following:

1. What are the technologies in which China is "stuck by the neck"?

2. At the end of 2019, the Premier presided over the National Science Symposium for Distinguished Young Scientists.

It can be seen that "stuck neck" technologies share several characteristics:

1) The technology requires long-term accumulation and cannot be achieved overnight.

2) It cuts across many disciplines: it requires not only very advanced specialist capabilities, but also development in many supporting fields such as manufacturing and maintenance.

3) There is a monopoly on industry "cognition": similar technologies or methods may exist in China, but users do not buy into them.

In my opinion, NCBI, well known in the field of bioinformatics, is actually a "stuck neck" technology. I don't know when it started, but NCBI often acts up when accessed from the mainland, and as for students' research progress during those 404 periods...

But no such item appears in the officially published list of technologies; "database management system" is the closest entry. That definitely underestimates the importance of this platform, which is heartbreaking!

Analysis of the advantages of NCBI.

NCBI is an important data resource for all kinds of biological researchers. I will not go over its functions again; if you are not familiar with them, see: NCBI function introduction. It has been 15 years since I first used NCBI in 2005, and I have witnessed its various revisions and upgrades, as well as its move to rely on Google. Because of the nature of my work, I have used almost all of its functions, so I should count as a senior user.

Analysis of core advantages:

NCBI's world-class data analysis and data integration management capabilities are astonishing. Doesn't that look a lot like what we use every day?

1. Practicality

1) Literature search: now that Google Scholar is unavailable, PubMed is a long-awaited ray of sunshine on a rainy day.

2) BLAST: for any researcher with formal academic training who is handed a DNA sequence and wants to know what it is, I think the first choice is always to go to NCBI BLAST for the answer.

3) Data submission: as the platforms with the most detailed submission tutorials, GenBank, SRA and GEO are still the first choice for uploading raw data.

2. Authority

1) Rigorous review mechanism: first there is the constantly upgraded, maddening automated validation process (it keeps getting smarter, to the point that one almost suspects AlphaGo is being applied to the life sciences). Then there is a manual check: every sequence submitted to GenBank is checked by very senior sequence experts. I believe anyone who has submitted Sanger sequences, small genomes and the like will have been left with an indelible impression.

2) Frequent updates and revisions: older textbooks said that NCBI updates its databases every 2 months by default, but in recent years the frequency has definitely increased. Beyond adding new data, existing records are also frequently revised, especially for controversial species.

3) Tackling the crown jewel of bioinformatics: sequence comparison.

If someone were to rank the major problems in bioinformatics, sequence comparison would definitely make the top ten. NCBI has been updating BLAST for decades; leaving performance and accuracy aside, that dedication alone is enough to leave people in awe.

The professionalism of the NCBI staff is beyond question. Niubility!
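To make the sequence comparison problem a little more concrete, here is a minimal sketch of local alignment scoring in the style of Smith-Waterman, a classic textbook approach. It is not BLAST itself, which relies on much faster heuristics; the function name and the scoring values (match, mismatch, gap) are illustrative assumptions and do not come from NCBI.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Scoring values are hypothetical defaults chosen for illustration.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two short made-up sequences sharing the fragment "ACG".
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

This exhaustive approach takes time proportional to the product of the two sequence lengths, which is exactly why searching a database the size of GenBank needs the kind of long-term engineering described above.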

3. A "big data" company

If NCBI were a purely commercial company, wouldn't it fit perfectly the "data company" that "Papa" Jack Ma so often talks about? The scary thing is that this company has been accumulating data for much, much longer than Alibaba.

Domestic alternative platforms.

Of course, NCBI cannot monopolize the field. Besides EMBL and DDBJ internationally, there are similar platforms in China, the most famous of which are CNDB of the National Gene Bank and NGDC of the Chinese Academy of Sciences.

1. CNDB of the National Gene Bank

CNDB, in which BGI's technology team is deeply involved, is undoubtedly the platform closest to NCBI, and in many respects it may be thinking further ahead and planning better. For example, in addition to storing and cataloguing genetic resources, it also has a "writing" function (synthetic biology).

It is powerful, and more and more domestic scientists are uploading their raw data to CNDB. However, after several years of promotion, only a few "elite" scientists take part, and ordinary users who know about the database are still very rare.

I think it is "aloof" in a way that only people in bioinformatics circles will appreciate. This can be seen from its simplest function, BLAST. Compare the NCBI and CNDB BLAST interfaces:

Which one is better for a biology researcher?

The answer is obvious!

Setting aside NCBI's advantage of having educated its users early, the CNDB interface is better suited to users doing large-scale NGS data analysis, and such users are still a minority.

My personal understanding:

1. This function is the tool used most frequently day to day, so it needs to fit the user experience better.

2. It shouldn't be too difficult; after all, there is no need to write BLAST comparison software from scratch.

The biggest reason for this unfriendliness is that the project leader is a "high-throughput data analyst" who has only ever worked on NGS, not a biological researcher. Such a person is limited in their understanding of the actual workflow, and so builds a "generation sequencing" product with a cloud computing, cloud platform mindset.

Of course, there is another reason for the current state of affairs:

Functions like these are pure public service. NCBI is funded by the United States government, whereas CNDB is still under pressure to fund itself, so it deliberately stays less accessible to keep operating costs down.

2. NGDC of the Chinese Academy of Sciences (National Genomics Data Center)

I personally use this database very little. Its interface looks very similar to NCBI's and its range of functions is very broad; if you are interested, give it a try. I don't have any specific opinions on it...

Enemies of domestic platforms.

Everyone in the biology field knows how important NCBI is to biological research, to say nothing of the experts and leaders, whose foresight and wisdom are not for mortals like me to question. Now and in the near future, the confrontation between China and the United States will keep escalating across industries, and clearly not only in chips. If NCBI one day bans Chinese IP addresses, graduate students will surely end up in the following state...

So, since domestic platforms exist, why are they so hard to promote? Here are the enemies of domestic platforms, listed in order of priority from high to low:

1) Users. That's right, users are the biggest enemy right now!

Chinese people need to trust domestic products. If you don't trust products made by other Chinese people, then build a professional platform yourself and use it yourself!

Once more people use domestic platforms, NCBI naturally cannot be used to choke us!

2) Time. Today Trump has severed ties with the WHO and restricted the number of Chinese students; tomorrow he may well move on to blocking databases such as NCBI, right?! So we don't have much time left.

3) Investment in professional talent.

Although the two domestic alternatives have done a very good job, the companies or institutions behind them are not dedicated database teams. Whether, given their own long-term priorities such as research projects and KPIs, they can keep allocating the energy and team resources to sustain this work is unknown.

Therefore, instead of pinning your hopes on someone else's platform, it is better to build your own!

4) Hardware investment.

Everything above concerns software, but biological databases also place very high demands on hardware and capital; otherwise NCBI would not need to rely on Google.

If a centrally managed national biological database is too difficult, then building a good database for a single species or a specific field is the best choice to save the country!

Note: The above content represents a personal opinion only!

Whether it is the cornerstones of basic science (data and databases) or biotechnology itself, it is increasingly obvious that these will become the focus of the next round of disputes between China and the United States. How should we respond? This is a brand-new challenge for our researchers.
