In this post I share basic knowledge and notes for data science beginners. You will also find a link to a Jupyter file with the code and its execution.
Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
Use the following import convention:
import pandas as pd
Here I continue the content of the previous post, "Data Science in Python: Pandas Introduction".
In this post I consider three data sources: CSV files, XLSX files, and SQL queries.
Read and Write CSV
df = pd.read_csv('origin-file.csv', header=None, nrows=5)
df.to_csv('destin-file.csv')
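To make the effect of header=None and nrows easier to see, here is a minimal sketch of a CSV round trip. It assumes a small made-up dataset held in memory with io.StringIO instead of a real 'origin-file.csv'; the variable names are only illustrative.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'origin-file.csv'.
csv_text = "a,b\n1,2\n3,4\n5,6\n"

# Default behaviour: the first row becomes the column header.
df = pd.read_csv(io.StringIO(csv_text))

# header=None treats every row as data (the header row included);
# nrows=2 stops after the first two rows.
raw = pd.read_csv(io.StringIO(csv_text), header=None, nrows=2)

# to_csv is a DataFrame method; index=False leaves out the row index.
out = io.StringIO()
df.to_csv(out, index=False)
csv_out = out.getvalue()
```

Note that writing is done through the DataFrame (df.to_csv), not through the pd module. Passing a file path instead of the StringIO object writes to disk.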
Read and Write Excel
df = pd.read_excel('origin-sheet.xlsx')
df.to_excel('destin-sheet.xlsx', sheet_name='Sheet1')
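The same round trip can be sketched for Excel. This example assumes the openpyxl package is installed (pandas relies on it for .xlsx files) and uses an in-memory buffer with made-up data in place of real workbook files.

```python
import io

import pandas as pd

# Hypothetical sample data; in practice this would come from your own source.
df = pd.DataFrame({"city": ["Lima", "Quito"], "population_m": [10.0, 2.8]})

# Write to an in-memory .xlsx buffer; a file path works the same way.
buffer = io.BytesIO()
df.to_excel(buffer, sheet_name="Sheet1", index=False)

# Rewind the buffer and read the sheet back by name.
buffer.seek(0)
loaded = pd.read_excel(buffer, sheet_name="Sheet1")
```

As with CSV, to_excel is called on the DataFrame, while read_excel is a function of the pd module.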
Read and Write to SQL Query or Database Table
from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:')
pd.read_sql('SELECT * FROM my_table;', engine)
pd.read_sql_table('my_table', engine)
pd.read_sql_query('SELECT * FROM my_table;', engine)
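The calls above expect my_table to already exist in the database. The following sketch creates it first so the query has something to return. As an assumption made to keep the example dependency-free, it uses a connection from Python's built-in sqlite3 module rather than a SQLAlchemy engine; pandas accepts such a connection for to_sql and read_sql_query, while read_sql_table requires SQLAlchemy. The table contents are made up.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical sample data written to the table 'my_table'.
df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
df.to_sql("my_table", conn, index=False)

# Read back only the rows that match the query.
result = pd.read_sql_query("SELECT * FROM my_table WHERE id > 1;", conn)

conn.close()
```

Here to_sql is the writing counterpart of the read functions, and like to_csv and to_excel it is a DataFrame method.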
Pandas makes it easy to read, analyze, and manipulate data from external sources such as CSV files, Excel workbooks, and SQL databases.