"To do a good job, you must first sharpen your tools." Here we will build an efficient Python development environment to prepare for the data analysis that follows.
When it comes to working efficiently, two things matter most when writing Python for data analysis.
01 1. A development tool with powerful autocomplete and error prompts.
Python's rich library of functions and components is the core reason for the strength of the language, but it is impossible to memorize every method name and parameter name. Often we remember only the first few letters of a common method. This is where a good development tool needs to intelligently "guess" what you want to type and offer a list of candidates to choose from (similar to the word suggestions of an input method).
In addition, when you make a mistake, the tool points it out and suggests what to change, which greatly improves writing efficiency. While others are still hunting for the misspelled word that keeps their code from running, you have already written a complete module.
02 2. Master shortcuts.
Python data analysis means reading results while you write. Sometimes after every two lines you need to click Run, create a new text paragraph, create a new code paragraph, and so on. Mastering the shortcut keys lets you do most of these operations without the mouse, so your hands never leave the keyboard, and you get twice the result with half the effort.
The whole configuration process has a few more steps than a traditional environment installation, but it is not complicated. You only need to follow the steps one by one.
The versions used in this walkthrough are:
Anaconda3
VS Code 1.51.1
In practice there are few version restrictions, so you can simply install the latest versions.
03 The first step: Anaconda, the enhanced Python environment for data science
Anaconda is a Python data science toolkit that bundles the libraries and tools most commonly used for numerical computing in Python, which makes it a must-install. It is now very mature, and the whole Anaconda suite is free for personal use.
1. Use your browser to open Anaconda's personal edition page and click Download. The page will jump to the download page.
2. Choose the installation package that matches your device (Mac or Windows). For both Windows and Mac, choose the Graphical Installer, which provides a graphical interface and is easier to use.
3. After the download finishes, double-click the installation package to start the installer (as shown in the image) and click Next.
4. The next screen is the license agreement. Click I Agree to accept the terms of use.
5. After clicking Next a few more times, you reach the screen for choosing the installation location. Unless you have a special need, keep the default location and continue clicking Next.
6. The last configuration screen is the advanced options. Nothing needs to change here. Just click Install and wait 2 or 3 minutes for the installation to complete.
After installation, find Anaconda Navigator among your programs and open it to see all the tools of Anaconda3 (as shown in the figure).
Notebook is the most widely used tool for data analysis, but on its own it is not efficient enough, because it lacks intelligent input suggestions, autocomplete, and error prompts. An effective analyst will not put up with writing in a "notepad".
So, next, we will configure a smart and powerful notebook on our own computer (leave the Anaconda3 window open for now).
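Before moving on, one optional sanity check is to confirm that a few data science libraries can be found by Anaconda's Python. The sketch below uses only the standard library. The package names (numpy, pandas, matplotlib) and the helper name check_packages are assumptions chosen for illustration, not something this article prescribes.

```python
import importlib.util

# Assumption: a few libraries commonly bundled with Anaconda.
PACKAGES = ["numpy", "pandas", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report each package without importing it.
for name, found in check_packages(PACKAGES).items():
    print(f"{name}: {'available' if found else 'missing'}")
```

If a package is reported as missing, the interpreter running the code is probably not the Anaconda one.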
04 The second step: VS Code, a lightning-fast code editor
VS Code (Visual Studio Code) is a cross-platform editor developed by Microsoft, and its powerful plugin ecosystem has made it one of the most popular editors in the world. Here we will use VS Code to solve the notebook's development efficiency problem.
First, follow the steps below to install and configure VS Code.
1. Download: open the VS Code web page in a browser. The page detects the current operating system automatically, so just click the download button to get the installation package.
2. Installation: after the download completes, double-click the installation package to install. All the default settings can be used.
3. Install the Chinese language pack [optional; readers who are comfortable with English can skip it]: start VS Code, open the plugin tab (the icon at the bottom of the left sidebar), type [Chinese], and click Install on the first plugin that appears. Once the installation is complete, restart VS Code for it to take effect.
4. Install the Python plugin: still in the plugin panel, type [python] and install the first plugin in the list.
At this point, the basic VS Code environment has been configured.
05 The third step: configure the Python environment of VS Code to use Anaconda
Open VS Code and select [File] - [New File] to create a default text file. Press Ctrl + S to save it, and name the file [Hello.py].
The suffix must be .py, because VS Code picks the appropriate toolchain based on the file's suffix. After saving, if VS Code recognizes the Python file, the Python plugin installed in the previous step starts working and looks for the local Python environment. The result is displayed on the status bar at the bottom.
Anaconda's Python environment contains a rich set of scientific computing libraries, so it is the first choice for data analysis. Once we've confirmed our environment, we're ready to move on to the final step.
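One way to double-check the environment shown on the status bar is to run a few lines in the Hello.py file created above. This is a minimal sketch using only the standard library. The helper name describe_interpreter is hypothetical, and the "conda" substring test is only a heuristic that assumes a default Anaconda install path.

```python
import sys


def describe_interpreter():
    """Return the path and version of the Python interpreter running this code."""
    return {
        "executable": sys.executable or "",   # may be empty in embedded setups
        "version": sys.version.split()[0],    # e.g. "3.8.5"
    }


info = describe_interpreter()
print("interpreter:", info["executable"])
print("version:", info["version"])
# Heuristic only (assumption): Anaconda installs usually have "conda" in the path.
print("looks like Anaconda:", "conda" in info["executable"].lower())
```

If the printed path does not point into the Anaconda installation folder, VS Code is using a different interpreter than the one installed in the first step.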
06 The fourth step: Jupyter in VS Code
Go to the VS Code plugin tab (the icon at the bottom of the left sidebar), type Jupyter, and install the Jupyter plugin officially produced by Microsoft (the first few results carry the Microsoft name).
After the installation is complete, restart VS Code (if the install button is greyed out, the plugin is already installed and you can continue directly). Press [Ctrl+P] to bring up the command panel and enter [>jupyter]. All the operations supported by the Jupyter plugin will be listed. Select [Jupyter: Create New Blank Jupyter Notebook], as shown in the following figure.
After you select it, a notebook-style editing interface appears inside VS Code. Unlike the traditional web version of the notebook, the notebook in VS Code has powerful hints and autocomplete. Next, let's learn its main operations.
The editing interface can be divided into three operation areas: the main operation area, the cell operation area, and the sidebar operation area.
Main operation area: mainly used to control the behavior of the entire notebook. (Hover the mouse over an icon to see what each button does.)
Sidebar operation area: the "+" signs in different positions insert a cell at the corresponding position.
Cell operation area: mainly used to control the behavior of the current cell.
The cell is the core concept of a notebook. A literal translation of the word does not capture what it means in a notebook, so this article keeps the term "cell" throughout. A notebook is composed of multiple cells, and there are two types of cells:
Code cell: mainly used to write Python code. Each cell can be executed separately, and the execution result is displayed below the cell.
Text cell: as the name suggests, used to write text. For data analysis, in addition to the code itself, the analytical thinking and the logic of derivation are also very important, and the text cell carries this content.
This is also the biggest difference between notebook and IPython: a notebook can mix code and text, which lets it present the results of a data analysis as fully as possible.
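To make the two cell types concrete, the sketch below builds a simplified picture of how a notebook file stores them: an .ipynb file is JSON text containing a list of cells, each marked as "code" or "markdown" (the text cell). The helper name make_notebook and the cell contents are illustrative assumptions, and the layout is a pared-down version of notebook format 4, not a complete file as VS Code would write it.

```python
import json


def make_notebook():
    """Build a minimal notebook-like dict with one code cell and one text cell."""
    code_cell = {
        "cell_type": "code",          # executed by Python; output appears below it
        "execution_count": None,      # not run yet
        "metadata": {},
        "outputs": [],
        "source": ["print(1 + 1)"],
    }
    text_cell = {
        "cell_type": "markdown",      # the "text cell": rendered, never executed
        "metadata": {},
        "source": ["Notes about the analysis go here."],
    }
    return {
        "cells": [code_cell, text_cell],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = make_notebook()
print(json.dumps(notebook, indent=2))
```

In everyday use there is no need to write this structure by hand, since the editor creates and updates it whenever the notebook is saved.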
07 Basic operations of notebook
Next, let's learn the basic operations of a notebook through a concrete task. These operations will be used frequently in subsequent blog posts, so let's get acquainted with them through a few simple examples.
1. Create a notebook and save it as My Practice.ipynb.
2. Add a cell, print "this is my first notebook" with code, and run it. In later cases, we will write a new cell to test our work at each small stage.
3. Add a cell, convert it to a text cell, and enter the text "My data analysis has started!".
4. Add a cell and print the result of 1+1 with code.
Let's start with the above example:
The first step is to press [Ctrl + P] ([Cmd + P] on Mac) to bring up the VS Code command panel, then enter [> Jupyter] to see the commands supported by the notebook plugin. The more commonly used ones are as follows.
Create New Blank Jupyter Notebook: creates a new blank notebook workspace.
Export to PDF: exports the current notebook to PDF, which will be useful when writing data analysis reports later.
Import Jupyter Notebook: imports an existing notebook file.
First, select the first command to create a new notebook, press [Ctrl + S] to save it, and enter the file name: first.ipynb.
The second step is to create a new cell. Click the + sign in the sidebar operation area to create one, then enter the following code and run the cell:
print("this is my first notebook")
In the third step, first create a new cell, click the M icon in the cell operation area to switch to text mode, and enter "My data analysis has started!". After the input is complete, click anywhere outside the cell to leave editing mode and enter preview mode (double-click the cell to return to editing mode). With that, the third step is complete, as shown in the figure.
The fourth step is very simple. Create a new cell directly and enter the following code:
print(1+1)
Run the cell and you will see "2" printed. At this point, our task is complete. The whole process is shown in the figure.
You have now configured a Python development environment for data analysis on your computer. You also know how to create a new notebook, add a code cell to enter code, and add a text cell to enter text.