Python Self-Study Notes

0. Personal notes

1. Learning Python by Mark Lutz

2. Mastering Python by Rick van Hattem

3. Data Science from Scratch

- First Principles with Python by Joel Grus: self-study notes

4/5/2018

about me

EE degree back in 1982

Z80 was the most popular CPU

Pascal/Fortran/COBOL were popular languages

Apple ][ + BASIC and CP/M

intel 80386SX PC mother board designer

......

Interested in Linux since 2016

Z80 CPU

intel 80386SX CPU

photo source: wikipedia.org

Apple ][

marconi.jiang@gmail.com

Vocabulary

Python Self-Study Notes

0. System

1. Miscellaneous

4/28/2018

Built-in functions

dir() lists the names currently defined in the session (the output below is from IPython)
- ['In', 'Out', '_', '__', '___', '__builtin__',
  '__builtins__', '__doc__', '__loader__',
  '__name__', '__package__', '__spec__',
  '_dh', '_i', '_i1', '_ih', '_ii', '_iii', '_oh',
  '_sh', 'exit', 'get_ipython', 'quit']
dir(__builtin__) lists the names in the built-in module (in plain Python 3, use dir(__builtins__) or import builtins)
- ['ArithmeticError',
  'AssertionError',
  'AttributeError', ....]
id(x), id(7)
- id() is a built-in function in Python. Syntax: id(object)
- The id() function returns a unique id for the specified object. Every object in Python has its own id, which is assigned when the object is created.
- The function accepts a single argument and returns the identity of that object. The identity is unique and constant for the object during its lifetime.
- In CPython the id is the object's memory address, so it usually differs each time you run the program. (CPython also caches the small integers from -5 to 256, so every use of one of those values refers to the same object.)
- Two objects with non-overlapping lifetimes may have the same id() value.
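The points above can be checked with a small sketch. The variable names are made up for illustration; the only claim is that two names bound to one object share an id, while an equal copy that is alive at the same time has a different one.

```python
# Two names bound to the same list share one identity.
a = [1, 2]
b = a            # no copy is made; b refers to the same object as a
c = list(a)      # a new list with equal contents

same_object = id(a) == id(b)        # both names point at one object
copy_is_distinct = id(a) != id(c)   # equal value, but a different object

# "is" compares identities, "==" compares values.
print(same_object, copy_is_distinct, a is b, a == c, a is c)
```

Because the actual id values change from run to run, the sketch only compares them and never prints a specific number.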

Built-in functions (continued)

type(a): shows the type of an object

Mutable / immutable

Mutable
- list, dict, set
Immutable
- str, tuple

Python Self-Study Notes

0. Miscellaneous

Installing Anaconda

Anaconda

How to change the Jupyter start-up folder

Open cmd (or Anaconda Prompt) and run

$ jupyter notebook --generate-config

This writes a file to

~/.jupyter/jupyter_notebook_config.py

Search for the following line:

#c.NotebookApp.notebook_dir = ''

Replace it with your own folder and uncomment it:

c.NotebookApp.notebook_dir = '/Volumes/HDD160G/Dropbox/'

Make sure you use forward slashes in your path and use /home/user/ instead of ~/ for your home directory

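Python Self-Study Notes

0. Miscellaneous

About import

Reference: The Definitive Guide to Python import Statements

NumPy - ways to import

Different import styles lead to different ways of referring to the same function

Error-prone usage: array() takes one sequence, not several separate numbers

Reference: Python Numpy的数组array和矩阵matrix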

import numpy
>>> a = numpy.array([1, 2,3,4])

import numpy as np
>>> a = np.array([1, 2,3,4])

from numpy import *
>>> a = array([1, 2,3,4])
>>> a
array([1, 2, 3, 4])

>>> a.dtype
dtype('int32')

>>> a = array(1,2,3,4)    # WRONG

>>> a = array([1,2,3,4])  # RIGHT

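Python Self-Study Notes

0. Miscellaneous

About Data Types

Python Data Types

Numeric types

int: integer. In Python 2 a plain int covers roughly -2**31 to 2**31 - 1 (platform dependent); a value outside that range is promoted automatically to a long integer, shown with a trailing L. In Python 3 there is only int, with no L suffix, and its range is limited only by available memory.
- 0b/0B : binary
- 0o/0O : octal
- 0x/0X : hexadecimal
long (Python 2 only; merged into int in Python 3)
float
bool
complex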
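As a quick check of the numeric types listed above, here is a minimal sketch for Python 3. The variable names are illustrative only.

```python
# Integer literals in binary, octal and hexadecimal notation.
binary = 0b1010      # 10
octal = 0o17         # 15
hexa = 0xFF          # 255

# Python 3 integers grow as needed; there is no separate long type.
big = 2 ** 100

print(binary, octal, hexa)
print(type(big).__name__, big > 2 ** 31)
print(type(1.5).__name__, type(True).__name__, type(1 + 2j).__name__)
```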

String type

str

Container types

list - [ ]
set - { , }
- An empty set can only be created with set(), not {}; {} is an empty dict.
- To build a set from the individual characters of a string or the elements of a list, use set() as well, not {}.
dict - { : , : } (a mapping)
tuple - ( )
bytes

Reference: book, Python 與量化投資

>>> S0 = {}
>>> S1 = set()

>>> type(S0)
<class 'dict'>

>>> type(S1)
<class 'set'>

>>> set('Quant')
{'a', 'Q', 'u', 'n', 't'}

>>> type(set('Quant'))
<class 'set'>

>>> {'Quant'}
{'Quant'}

>>> len({'Quant'})
1

>>> type({'Quant'})
<class 'set'>

Python Data Types – Learn From Basic To Advanced

Booleans
Numbers
Strings
Bytes
Lists
Tuples
Sets
Dictionaries

Reference: Python Data Types – Learn From Basic To Advanced

About Python attributes

Data structures available in Python
- Basic: tuple, generator, list, set, dictionary (array is available from the standard-library array module; Series is a pandas type)
  - tuple - (1,2,3,4)
  - generator - (ord(x) for x in 'spaam')
  - list - [ord(x) for x in 'spaam']
  - set - {ord(x) for x in 'spaam'}
  - dictionary - {x: ord(x) for x in 'spaam'}
- Common third-party: DataFrame, np.array
Pandas DataFrame
- df.info()
- df.columns
- df.keys()
Which data structures does an object itself contain?

Reference: 淺談 Python 的屬性

About Python attributes - Tuple / Generator

Parentheses are used for three different things: grouping, tuple literals, and function calls.
Compare (1 + 2) (an integer) and (1, 2) (a tuple).
In the generator assignment, the parentheses are for grouping;
in the tuple assignment, the parentheses are a tuple literal.
Parentheses represent a tuple literal when they contain a comma and are not used for a function call.
This works since there is no way (1,2,3,4) could be a generator. There is nothing to generate there: you specified all the elements, not a rule to obtain them.
For your generator to be a tuple, the expression (i for i in sample_list) would have to be a tuple comprehension. Python has no tuple comprehension syntax; to get a tuple, pass the generator expression to tuple(), for example tuple(i for i in sample_list).
Iterating over the generator expression or the list comprehension will do the same thing. However, the list comprehension will create the entire list in memory first while the generator expression will create the items on the fly, so you are able to use it for very large (and also infinite!) sequences.

References: Python tuple vs generator; Generator Expressions vs. List Comprehension
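The difference between a list comprehension and a generator expression can be seen in a short sketch. The names and the range of 1000 numbers are arbitrary choices for illustration.

```python
import itertools
import sys

squares_list = [n * n for n in range(1000)]   # all 1000 values exist in memory now
squares_gen = (n * n for n in range(1000))    # values are produced only on demand

# The generator object stays small no matter how many items it will yield.
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)

# A generator is consumed as it is read.
first_three = [next(squares_gen) for _ in range(3)]
remaining_total = sum(squares_gen)            # adds up the values not yet consumed

# There is no tuple comprehension; wrap a generator expression in tuple().
as_tuple = tuple(ord(ch) for ch in 'spaam')

# A generator can describe an infinite sequence; islice takes a finite part.
evens = (n for n in itertools.count() if n % 2 == 0)
first_five_evens = list(itertools.islice(evens, 5))

print(list_is_bigger, first_three, as_tuple, first_five_evens)
```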

About Python attributes - Tuple / List (1/2)

Difference between list and tuple

Literal
```
someTuple = (1,2)
someList  = [1,2] 
```

Size

a = tuple(range(1000))  # a generator would save even more memory
b = list(range(1000))   # c = (i for i in range(1000))

a.__sizeof__() # 8024   # c.__sizeof__() # 64
b.__sizeof__() # 9088

Because a tuple is smaller, operations on it are slightly faster, but the difference is negligible unless the number of elements is huge. (The exact byte counts vary with the Python version.)

Usage

As a list is mutable (and therefore unhashable), it can't be used as a key in a dictionary, whereas a tuple can be used.

a    = (1,2)
b    = [1,2] 

c = {a: 1}     # OK
c = {b: 1}     # TypeError: unhashable type: 'list'

Reference: What's the difference between lists and tuples?

About Python attributes - Tuple / List (2/2)

Permitted operations

b    = [1,2]   
b[0] = 3       # [3, 2]

a    = (1,2)
a[0] = 3       # Error

That also means that you can't delete an element from a tuple or sort it in place. You can, however, add a new element to both a list and a tuple; the only difference is that adding to a tuple creates a new object, so its id changes, while the list keeps the same id.

a     = (1,2)
b     = [1,2]  

id(a)          # 140230916716520
id(b)          # 748527696

a   += (3,)    # (1, 2, 3)
b   += [3]     # [1, 2, 3]

id(a)          # 140230916878160
id(b)          # 748527696

Comparing the properties and methods of list and tuple

Their methods differ: for example, a list can be sorted in place with sort(), while a tuple has no sort() method.

Properties of a list:
A list is a collection which is ordered and changeable. It allows duplicate members.
Properties of a tuple:
A tuple is a collection which is ordered and unchangeable. It allows duplicate members.
That is, a tuple is ordered and unchangeable (immutable).

Reference: Python 源碼解析之 list 與 tuple 的 特性與操作 methods 比較
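A tuple cannot be sorted in place, but the built-in sorted() accepts any iterable and returns a new list. The sketch below uses made-up values to show the difference.

```python
scores = (3, 1, 2)

list_has_sort = hasattr([], 'sort')         # lists provide an in-place sort()
tuple_has_sort = hasattr(scores, 'sort')    # tuples do not

ordered = sorted(scores)          # a new list; the tuple itself is unchanged
ordered_tuple = tuple(ordered)    # convert back if a tuple is needed

numbers = [3, 1, 2]
result = numbers.sort()           # sorts the list in place and returns None

print(list_has_sort, tuple_has_sort, ordered, ordered_tuple, numbers, result)
```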

Python Self-Study Notes

0. Miscellaneous

About NumPy and arrays

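NumPy - arrays

When you print an array, NumPy displays it much like a nested list, with the following layout:

the last axis is printed from left to right,
the second-to-last axis is printed from top to bottom,
the remaining axes are also printed from top to bottom, with each slice separated from the next by an empty line.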

>>> c = arange(24).reshape(2,3,4)         # 3d array
>>> print(c)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]] 

>>> c
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

The shape (2, 3, 4) can be read as (z, y, x). The element at position (z, y, x) holds the value z*12 + y*4 + x:

z = 0 (values start at 0)
  y = 0: [ 0,  1,  2,  3]    (x runs from 0 to 3)
  y = 1: [ 4,  5,  6,  7]
  y = 2: [ 8,  9, 10, 11]

z = 1 (values start at y * x = 3 * 4 = 12)
  y = 0: [12, 13, 14, 15]
  y = 1: [16, 17, 18, 19]
  y = 2: [20, 21, 22, 23]

Note that print() and the interactive prompt display the same array differently.
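The (z, y, x) layout of arange(24).reshape(2,3,4) can be imitated with plain nested lists. This is only a sketch of the index arithmetic, not NumPy's implementation; the chunk helper is a hypothetical name introduced for the example.

```python
def chunk(seq, size):
    """Split seq into consecutive pieces of the given size."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

flat = list(range(24))   # the 24 values in storage order
rows = chunk(flat, 4)    # 6 rows of 4 values (x axis)
c = chunk(rows, 3)       # 2 blocks of 3 rows (y axis inside z axis)

# Every element sits at flat position z*12 + y*4 + x.
formula_holds = all(
    c[z][y][x] == z * 12 + y * 4 + x
    for z in range(2) for y in range(3) for x in range(4)
)

print(c[0][0], c[1][2], formula_holds)
```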

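NumPy - arrays

Another example

A one-dimensional array ([2, 3]) has shape (2,).
Reshaped to the two-dimensional shape (1, 2), it becomes ([[2, 3]]). Because it is two-dimensional, there are two levels of [ ]. By the same logic, a two-dimensional array of shape (2, 2) looks like

([[0, 1],
  [2, 3]])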

In [1]: input_data =np.array([2,3])
In [2]: input_data.reshape(1,2).shape
Out[2]: (1, 2)

In [3]: input_data.reshape(1,2)
Out[3]: array([[2, 3]])

In [4]: input_data.shape
Out[4]: (2,)

In [5]: input_data
Out[5]: array([2, 3])

In [6]: input_data * weights['node_1']
Out[6]: array([-2,  3])

NumPy - querying object attributes

The dtype and itemsize results depend on the platform. The output below shows 'int32' and 4; on my machine the results are 'int64' and 8.

>>> import numpy as np
>>> a = np.arange(15).reshape(3, 5)
>>> a
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> a.shape
(3, 5)
>>> a.ndim
2
>>> a.dtype.name
'int32'
>>> a.itemsize
4
>>> a.size
15
>>> type(a)
numpy.ndarray

NumPy - setting object attributes

Complex numbers

For other functions (array, zeros, zeros_like, ones, ones_like, empty, empty_like, arange, linspace, rand, randn, fromfunction, fromfile), see: NumPy示例

>>> c = np.array([[1,2], [3,4]], dtype=complex)
>>> c
array([[ 1.+0.j,  2.+0.j],
       [ 3.+0.j,  4.+0.j]])

>>> np.zeros( (3,4) )
array([[0.,  0.,  0.,  0.],
       [0.,  0.,  0.,  0.],
       [0.,  0.,  0.,  0.]])
>>> np.ones( (2,3,4), dtype=np.int16 )    # dtype can also be specified
array([[[ 1, 1, 1, 1],
        [ 1, 1, 1, 1],
        [ 1, 1, 1, 1]],
       [[ 1, 1, 1, 1],
        [ 1, 1, 1, 1],
        [ 1, 1, 1, 1]]], dtype=int16)
>>> np.empty( (2,3) )
array([[1.39069238e-309, 1.39069238e-309, 1.39069238e-309],
       [1.39069238e-309, 1.39069238e-309, 1.39069238e-309]])
>>> np.linspace(0, np.pi, 3)
array([0.        , 1.57079633, 3.14159265])

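The function zeros creates an array full of zeros, ones creates an array full of ones, and empty creates an array whose initial content is arbitrary and depends on the state of memory. By default, the dtype of the created array is float64.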

NumPy - calculation / statistics

sum(), min(), max()

By specifying the axis parameter you can apply an operation along a given axis of the array:

>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> a.sum()
66
>>> a.min()
0
>>> a.max()
11

>>> a.sum(axis=0)                            # sum of each column
array([12, 15, 18, 21])
>>>
>>> a.min(axis=1)                            # min of each row
array([0, 4, 8])
>>>
>>> a.cumsum(axis=1)                         # cumulative sum along each row
array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]])

NumPy - ufunc (universal functions)

NumPy provides familiar mathematical functions such as sin, cos and exp. In NumPy these are called "universal functions" (ufunc). They operate elementwise on an array and produce an array as output.

>>> b = np.arange(3)
>>> b
array([0, 1, 2])
>>> np.exp(b)
array([ 1.        ,  2.71828183,  7.3890561 ])
>>> np.sqrt(b)
array([ 0.        ,  1.        ,  1.41421356])
>>> c = np.array([2., -1., 4.])
>>> np.add(b, c)
array([ 2.,  0.,  6.])

More functions: all, alltrue, any, apply along axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, conjugate, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, inv, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, re, round, sometrue, sort, std, sum, trace, transpose, var, vdot, vectorize, where. See: NumPy示例

When operating on arrays of different types, the type of the resulting array corresponds to the more general or more precise one (a behavior known as upcasting).

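NumPy - to be continued

So far these notes cover only about 1/5 of the reference; to be continued.

Reference: Python Numpy的数组array和矩阵matrix

NumPy - array

Reference: How to Index, Slice and Reshape NumPy Arrays for Machine Learning in Python

Array

The first example below raises

TypeError: can't multiply sequence by non-int of type 'list'

input_data has to be an np.array, but the contents of weights do not. Why? When one operand of * is a NumPy array, NumPy converts the other operand (a plain list) to an array and multiplies elementwise. When both operands are plain lists, Python has no list-by-list multiplication and raises the TypeError.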

input_data = ([3, 5])
weights = {'node_0_0': ([2, 4]), 
           'node_0_1': ([ 4, -5]), 
           'node_1_0': ([-1,  2]), 
           'node_1_1': ([1, 2]), 
           'output': ([2, 7])}
input_data * weights['node_0_0']

TypeError: can't multiply sequence by non-int of type 'list'

input_data = np.array([3, 5])
weights = {'node_0_0': ([2, 4]), 
           'node_0_1': ([ 4, -5]), 
           'node_1_0': ([-1,  2]), 
           'node_1_1': ([1, 2]), 
           'output': ([2, 7])}
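The TypeError above comes from plain Python lists, not from NumPy. The sketch below reproduces it with the standard library only; the short weights list stands in for one entry of the weights dictionary. With a list, * means repetition and needs an integer. Once input_data is an np.array, its own * operator takes over and accepts the list on the other side.

```python
input_data = [3, 5]
weights = [2, 4]          # stands in for weights['node_0_0']

# For a list, * means repetition, so the other operand must be an int.
repeated = input_data * 2

try:
    input_data * weights  # list * list is not defined
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Elementwise multiplication in plain Python needs an explicit loop.
products = [x * w for x, w in zip(input_data, weights)]

print(repeated, error_message, products)
```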

NumPy - One Hot Encoding

pd.get_dummies and np_utils.to_categorical are used differently

from keras.utils import to_categorical
from keras.utils import np_utils
import numpy

Reference: Unable to transform string column to categorical matrix using Keras and Sklearn

Python Self-Study Notes

0. Miscellaneous

About NumPy's random functions

Reference: 为什么你用不好Numpy的random函数？

NumPy - numpy.random vs Python's random

According to Python for Data Analysis, the numpy.random module supplements Python's built-in random module with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.

References:

1. Python Software Foundation random

2. Python - random 模組用法筆記

3. Scipy.org random

4. Stackoverflow Differences between numpy.random and random.random in Python
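For comparison, here is a minimal sketch of the standard-library random module. It returns one value per call, so producing several values takes a loop, which is the gap numpy.random fills. The seed 42 and the variable names are arbitrary.

```python
import random

rng = random.Random(42)            # a seeded generator gives repeatable results

one_float = rng.random()           # 0.0 <= value < 1.0
one_int = rng.randint(1, 6)        # both end points are included
one_pick = rng.choice(['a', 'b', 'c'])

# One value per call: a loop is needed to get several samples.
many_floats = [rng.random() for _ in range(5)]

# The same seed reproduces the same sequence.
again = random.Random(42)
same_sequence = again.random() == one_float

print(one_float, one_int, one_pick, len(many_floats), same_sequence)
```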

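About list

append vs. extend: append

Note the last example in particular: the whole object is stored as a single item.

Also, no index is needed with append; the item always goes to the end of the list.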

The list.append method appends an object to the end of the list.

my_list.append(object) 

Whatever the object is, whether a number, a string, another list, or something else, it gets added onto the end of my_list as a single entry on the list.

>>> my_list
['foo', 'bar']
>>> my_list.append('baz')
>>> my_list
['foo', 'bar', 'baz']

So keep in mind that a list is an object. If you append another list onto a list, the first list will be a single object at the end of the list (which may not be what you want):

>>> another_list = [1, 2, 3]
>>> my_list.append(another_list)
>>> my_list
['foo', 'bar', 'baz', [1, 2, 3]]
                     #^^^^^^^^^--- single item on end of list.

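Reference: Difference between append vs. extend list methods in Python

pandas objects had a similarly named append method, which returned a new object instead of modifying the original (it was removed in pandas 2.0; pd.concat is the replacement):

bigdata = data1.append(data2, ignore_index=True)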

About list

append vs. extend: extend

Note the last example in particular: 'baz' is split into 3 separate items.

=> So far I have only used append (I honestly do not know yet what extend is good for).

The list.extend method extends a list by appending elements from an iterable:

my_list.extend(iterable)
So with extend, each element of the iterable gets appended onto the list. For example:

>>> my_list
['foo', 'bar']
>>> another_list = [1, 2, 3]
>>> my_list.extend(another_list)
>>> my_list
['foo', 'bar', 1, 2, 3]
Keep in mind that a string is an iterable, so if you extend a list with a string, you'll append each character as you iterate over the string (which may not be what you want):

>>> my_list.extend('baz')
>>> my_list
['foo', 'bar', 1, 2, 3, 'b', 'a', 'z']

Reference: Difference between append vs. extend list methods in Python
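One common use for extend is flattening: collecting the items of several lists into one flat list. The sketch below uses made-up data and contrasts it with append, which keeps the nesting.

```python
pages = [['foo', 'bar'], ['baz'], [1, 2, 3]]

flat = []
for page in pages:
    flat.extend(page)        # adds each element of page

nested = []
for page in pages:
    nested.append(page)      # adds page itself as one item

plus_equals = ['foo']
plus_equals += ['bar', 'baz']   # += on a list behaves like extend

print(flat)
print(nested)
print(plus_equals)
```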

About dictionary

dict.keys() # keys of dictionary
dict.values() # values of dictionary
How to merge two dicts still needs some study.

Reference: How to merge two dictionaries in a single expression?
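Two common ways to merge dictionaries are sketched below with made-up keys. In both, values from the second dictionary win when a key appears in both, and the originals are left unchanged.

```python
defaults = {'host': 'localhost', 'port': 8080}
overrides = {'port': 9090, 'debug': True}

# Single expression (Python 3.5+): unpack both into a new dict.
merged = {**defaults, **overrides}

# Works in any version: copy the first, then update with the second.
merged_copy = defaults.copy()
merged_copy.update(overrides)

# Python 3.9+ also accepts: defaults | overrides

print(merged)
print(merged_copy == merged, defaults)
```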

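About pandas

pandas basics
get_dummies vs to_categorical
Creating Pandas DataFrames from Lists and Dictionaries

pandas basics

Data structures provided by pandas
1. Series: a one-dimensional array with an index, often used for time-series data (such as sensor readings).
2. DataFrame: for structured (table-like) data; a two-dimensional data set with a row index and column labels, for example data from relational databases or CSV files.
3. Panel: a three-dimensional data set with an item index, a row index and column labels. (Panel was deprecated and has been removed from recent pandas releases.)

References: Pandas 基礎教學
Pandas Indexing and Selecting Data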
Object Type	Indexers
Series	s.loc[indexer]
DataFrame	df.loc[row_indexer,column_indexer]
Panel	p.loc[item_indexer,major_indexer,minor_indexer]

pandas basics (continued)

As mentioned when introducing the data structures in the last section, the primary function of indexing with [ ] (a.k.a. __getitem__ for those familiar with implementing class behavior in Python) is selecting out lower-dimensional slices. The following table shows return type values when indexing pandas objects with [ ]:

References: Pandas 基礎教學
Pandas Indexing and Selecting Data

Object Type	Selection	Return Value Type
Series	series[label]	scalar value
DataFrame	frame[colname]	Series corresponding to colname
Panel	panel[itemname]	DataFrame corresponding to the itemname

About pandas

12 Useful Pandas Techniques in Python for Data Manipulation
1 – Boolean Indexing
2 – Apply Function
3 – Imputing missing files
4 – Pivot Table
5 – Multi-Indexing
6 – Crosstab
7 – Merge DataFrames
8 – Sorting DataFrames
9 – Plotting (Boxplot & Histogram)
10 – Cut function for binning
11 – Coding nominal data
12 – Iterating over rows of a dataframe

Reference: 12 Useful Pandas Techniques in Python for Data Manipulation

About pandas

```
Searching for strings in a DataFrame
```
Drop the rows (axis = 0) or columns (axis = 1) of a DataFrame that contain a 'special string'
A Medium article with a simple, clear introduction
- [資料分析&機器學習] 第2.3講：Pandas 基本function介紹(Series, DataFrame, Selection, Grouping)
The official pandas site
- pandas: powerful Python data analysis toolkit

Reference: pandas + dataframe - select by partial string

df = df.drop(df[df['id'].str.contains('特殊字串')].index, axis=0)

```
df[df['A'].str.contains("hello")]
```

About pandas

When a numeric DataFrame column contains NaN, replace it with 0 (or another number, such as the mean)

The map function


df['column'] = df['column'].fillna(0)  # can be another number instead of 0

# or
age_mean = df['age'].mean()
df['age'] = df['age'].fillna(age_mean)  # fillna(..., inplace=True) returns None, so do not assign its result

df['sex']= df['sex'].map({'female':0, 'male': 1}).astype(int)

# or
male = []  # 1 for 'male', 0 otherwise
for i in df['Sex']:
    if i=='male':
        male.append(1)
    else:
        male.append(0)
df['sex'] = male

About the pandas get_dummies() function

```
See to_categorical on the next page
```

In[1]: df[:2]
Out[1]:
embarked    survived	pclass	sex	age	sibsp	parch	fare
S        0	1	1	0	29.0000	0	0	211.3375
S        1	1	1	1	0.9167	1	2	151.5500
In[2]: x_OneHot_df = pd.get_dummies(data=df,columns=["embarked" ])
In[3]: x_OneHot_df[:2]
Out[3]: 	
embarked_C	embarked_Q	embarked_S      survived	pclass	sex	age	sibsp	parch	fare
0	0	1        0	1	1	0	29.0000	0	0	211.3375
0	0	1        1	1	1	1	0.9167	1	2	151.5500

# or my dummy way
embarked_from_cherbourg = []
embarked_from_queenstown = []
embarked_from_southampton = []
for i in df['Embarked']:
    if i=='C':
        embarked_from_cherbourg.append(1)
    else:
        embarked_from_cherbourg.append(0)

for i in df['Embarked']:
    if i=='Q':
        embarked_from_queenstown.append(1)
    else:
        embarked_from_queenstown.append(0)

for i in df['Embarked']:
    if i=='S':
        embarked_from_southampton.append(1)
    else:
        embarked_from_southampton.append(0)

df['embarked_from_cherbourg'] = embarked_from_cherbourg
df['embarked_from_queenstown'] = embarked_from_queenstown
df['embarked_from_southampton'] = embarked_from_southampton

About the keras to_categorical function

```
See get_dummies() on the previous page
```

from numpy import array 
from numpy import argmax 
from keras.utils import to_categorical 
# define example 
data = [1, 3, 2, 0, 3, 2, 2, 1, 0, 1] 
data = array(data) 
print(data) 
# one hot encode 
encoded = to_categorical(data) 
print(encoded) 
# invert encoding 
inverted = argmax(encoded[0]) 
print(inverted) 

# Source: https://itw01.com/GJFRE5J.html

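About pandas

For a DataFrame:
axis = 0 refers to the index (rows)
axis = 1 refers to the columns
df.loc['2018-1-1': , ['column1', 'column2']]   # label-based selection
df.iloc[0: , [0, 1]]                           # position-based selection: iloc takes integer positions, not column names
holding_stocks.at[i,'price']= float(stock_price['Close'])
df.cumsum(axis = 0)
df['column1']
df.drop('2018-1-1', axis = 0)   # drop a row by its index label
df.drop('column1', axis = 1)    # drop a column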

How to use lambda
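The slide gives no example for this heading, so here is a minimal sketch of typical lambda usage, with made-up data. A lambda is a small anonymous function limited to a single expression, often passed as a key or callback.

```python
# A lambda bound to a name behaves like a one-line def.
double = lambda x: x * 2

words = ['pandas', 'csv', 'numpy']

# As a sort key: order the words by their length.
by_length = sorted(words, key=lambda w: len(w))

# With map: apply the function to every element.
squares = list(map(lambda n: n * n, [1, 2, 3]))

print(double(4), by_length, squares)
```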

Unicode, UTF-8, UTF-16, and Chinese text encoding in Python 2.x and 3.x

- How to display Chinese in pandas plot?
- [教學]Python Matplotlib 無法顯示中文 (Python初學特訓班、圖表、直線圖) @查理B愛說說
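As a small illustration of the encoding topic, the sketch below (Python 3) encodes the two-character string 中文 in UTF-8 and UTF-16. The string is written with escape codes so the example does not depend on the source file's own encoding.

```python
text = '\u4e2d\u6587'              # the two characters of "Chinese" in Chinese

utf8 = text.encode('utf-8')        # str -> bytes
utf16 = text.encode('utf-16-le')   # little-endian, without a byte-order mark

round_trip = utf8.decode('utf-8')  # bytes -> str

# Character count versus byte counts.
print(len(text), len(utf8), len(utf16))
print(round_trip == text)
```

For these two characters UTF-8 needs three bytes each and UTF-16 two bytes each, which is why the byte counts differ from the character count.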

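About bytes and bytearray

bytes and bytearray are the types for handling byte data.

bytes is immutable.

bytearray is mutable.

Both hold a sequence of unsigned 8-bit integers (bytes), each in the range 0 to 255.

They provide many methods similar to those of str, and they support slicing.

Note, however, that indexing a single byte returns an int object, while a slice returns a bytes (or bytearray) object.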

# byte.py
w=b"abc"
print(w[0])
print(type(w[0]))
print(w[:1])
print(type(w[:1]))



# byte_2.py
w=b"\x74\x61\x69\x70\x65\x69"
print(w)
a=bytes.fromhex("746169706569")
print(a)
print(type(a))
bytearr = bytearray(a)
print(bytearr)
print(type(bytearr))
bytearr.pop()
print(bytearr)
bytearr.pop()
print(bytearr)
bytearr.pop()
print(bytearr)
bytearr.append(110)
print(bytearr)
bytearr.append(97)
print(bytearr)
bytearr.append(ord("n"))
print(bytearr)

About sorting

sort by values (in ascending order)
and display 20 highest values

```
>>> y107m01.sort_values(by='mom')
```

>>> y107m01.sort_values(by='mom').tail(20)

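About csv - reading a csv file

This program opens the csv file and then uses csv.reader() to split each line on commas, returning each row as a list. It is a bit like split, but csv.reader() also handles the quotation marks (" ") for you, which plain split cannot do.
The csv module offers other useful features too. For example, csv.DictReader parses each row into a dictionary, using the first row as the keys.
You can also specify the key names yourself:

# -*- coding: utf-8 -*-
import csv
f = open('example.csv', 'r')
for row in csv.reader(f):
    print(row)
f.close()

Reference: [Python]讀寫csv檔教學

# -*- coding: utf-8 -*-
import csv
f = open('example.csv', 'r')
for row in csv.DictReader(f, ["日期", "成交股數", "成交金額", "成交筆數", "指數", "漲跌點數"]):
    print(row['指數'])
f.close()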
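The difference between csv.reader and a plain split shows up as soon as a quoted field contains a comma. The sketch below uses an in-memory string instead of example.csv; the column names and numbers are made up for illustration.

```python
import csv
import io

# Made-up sample data: the index value contains a comma inside quotes.
sample = (
    'date,index,change\n'
    '2018-04-02,"10,888.27",-31.22\n'
    '2018-04-03,"10,821.53",-66.74\n'
)

# csv.reader keeps the quoted field together and removes the quotes.
rows = list(csv.reader(io.StringIO(sample)))

# A plain split breaks the quoted number into two pieces.
naive = sample.splitlines()[1].split(',')

# csv.DictReader uses the first row as the dictionary keys.
records = list(csv.DictReader(io.StringIO(sample)))
index_values = [record['index'] for record in records]

print(rows[1])
print(naive)
print(index_values)
```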

About csv - writing a csv file

When saving a DataFrame to a csv file, set index=False if the DataFrame already has an index column among its data, so that no extra index column is written:

df.to_csv(file_name, encoding='utf-8', index=False)

If the file was written with index=True, read it back with index_col=0:

df = pd.read_csv(file_name, encoding='utf-8', index_col=0)

Reference (Stack Overflow): Pandas writing dataframe to CSV file

About print

From Stack Overflow: Print multiple arguments in python

Here are some common ways of doing it:

1. Pass it as a tuple:
print("Total score for %s is %s" % (name, score))

2. Pass it as a dictionary:
print("Total score for %(n)s is %(s)s" % {'n': name, 's': score})

There's also new-style string formatting, which might be a little easier to read:

3. Use the new-style string formatting:
print("Total score for {} is {}".format(name, score))

4. Use the new-style string formatting with numbers (useful for reordering or printing the same one multiple times):
print("Total score for {0} is {1}".format(name, score))

5. Use the new-style string formatting with explicit names:
print("Total score for {n} is {s}".format(n=name, s=score))

The clearest two, in my opinion:

6.Pass the values as parameters and print will do it:
print("Total score for", name, "is", score)

7. If you don't want spaces to be inserted automatically by print in the above example, change the sep parameter:
print("Total score for ", name, " is ", score, sep='')

If you're using Python 2, you won't be able to use the last two because print isn't a function in Python 2. You can, however, import this behavior from __future__:
from __future__ import print_function
Use the new f-string formatting in Python 3.6:
print(f'Total score for {name} is {score}')

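About Python exceptions

Reference: Python 的異常名稱

Exception name	Description
BaseException	Base class of all exceptions
SystemExit	The interpreter is requested to exit
KeyboardInterrupt	The user interrupted execution (usually by typing ^C)
Exception	Base class of ordinary errors
StopIteration	The iterator has no more values
GeneratorExit	Raised in a generator to tell it to exit
StandardError	Base class of all built-in standard exceptions (Python 2 only)
ArithmeticError	Base class of all numeric calculation errors
FloatingPointError	Floating-point calculation error
OverflowError	A numeric operation exceeded the maximum limit
ZeroDivisionError	Division (or modulo) by zero (all numeric types)
AssertionError	An assert statement failed
AttributeError	The object has no such attribute
EOFError	No input from the built-in input function; EOF reached
EnvironmentError	Base class of operating-system errors
IOError	An input/output operation failed
OSError	Operating-system error
WindowsError	A system call failed
ImportError	Importing a module or object failed
LookupError	Base class of invalid-lookup errors
IndexError	No such index in the sequence
KeyError	No such key in the mapping
MemoryError	Out of memory (not fatal to the Python interpreter)
NameError	Undeclared or uninitialized object (no such name)
UnboundLocalError	Access to an uninitialized local variable
ReferenceError	A weak reference tried to access an object that has already been garbage collected
RuntimeError	General runtime error
NotImplementedError	A method that has not been implemented yet
SyntaxError	Python syntax error
IndentationError	Indentation error
TabError	Tabs and spaces mixed
SystemError	General interpreter system error
TypeError	An operation invalid for the type
ValueError	An invalid argument was passed
UnicodeError	Unicode-related error
UnicodeDecodeError	Error while decoding Unicode
UnicodeEncodeError	Error while encoding Unicode
UnicodeTranslateError	Error while translating Unicode
Warning	Base class of warnings
DeprecationWarning	Warning about a deprecated feature
FutureWarning	Warning about a construct whose semantics will change in the future
OverflowWarning	Old warning about automatic promotion to long (no longer exists in Python 3)
PendingDeprecationWarning	Warning about a feature that will be deprecated
RuntimeWarning	Warning about suspicious runtime behavior
SyntaxWarning	Warning about suspicious syntax
UserWarning	Warning generated by user code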

How To Best Use Try Except In Python – Especially For Beginners

Reference: How To Best Use Try Except In Python – Especially For Beginners
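A minimal sketch of the try / except / else / finally structure follows. It is not taken from the referenced article; the function name safe_divide and the log list are made up for illustration.

```python
log = []

def safe_divide(a, b):
    """Return a / b, or None when the division cannot be carried out."""
    try:
        result = a / b
    except ZeroDivisionError:
        log.append('division by zero')
        return None
    except TypeError:
        log.append('unsupported operand types')
        return None
    else:
        return result            # runs only when no exception was raised
    finally:
        log.append('done')       # always runs, even after a return

ok = safe_divide(6, 3)
by_zero = safe_divide(1, 0)
bad_type = safe_divide('a', 2)

print(ok, by_zero, bad_type)
print(log)
```

Catching the specific exception classes, rather than a bare except, keeps unrelated errors visible.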

Open questions about Python

Where a method 'correctly' goes, and whether parentheses are needed
list.method(object)
- x.append(4) # modifies x in place by appending 4
dict2 = dict1.method()
- z = x.copy() # copies x to z; why not copy(x) or x.copy?
- df.info() # info about df (a DataFrame)
module.attribute
- sys.platform # sys.platform() raises TypeError, because it is a string, not a function
df.method(file_name)
- df.to_csv(file_name) # why not to_csv(file_name, df) or file_name.to_csv(df)?
- pd.to_numeric(s) # returns s converted to numeric
l = function(s)
- l = len(s) # stores the length of the string s in l
x.counter = 1
- x.counter = 1 # x is an instance of MyClass; not x(counter)=1

Reference: Python 教學文件
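Some of the questions above come down to the difference between a method, a function, and a plain attribute. The sketch below is one way to see it, using only the standard library; the class MyClass is an empty placeholder.

```python
import sys

x = [1, 2, 3]
x.append(4)          # a method call: modifies x in place and returns None

z = x.copy()         # with (): the method runs and returns a new list
bound = x.copy       # without (): this is the method object itself, not a copy
also_copy = bound()  # calling it later still works

length = len(x)      # a built-in function; it delegates to x.__len__()

# sys.platform is an attribute holding a string, so it takes no parentheses.
platform_is_string = isinstance(sys.platform, str)

class MyClass:
    pass

obj = MyClass()
obj.counter = 1      # attribute assignment on an instance

print(z, callable(bound), length, platform_is_string, obj.counter)
```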

Learning Python
by Mark Lutz
Self-study notes
4/24/2018

References

Book info on O'Reilly
Code for the book on O'Reilly github

Binary Bytes Files

The Python struct module can both create and unpack packed binary data

# coding: utf-8
# In[1]:
import struct
packed = struct.pack('>i4sh', 7, b'spam', 8)
packed
# Out[1]:
b'\x00\x00\x00\x07spam\x00\x08'

# In[2]:
file = open('binary_data.bin', 'wb')
file.write(packed)
file.close()


# In[3]:
data = open('binary_data.bin', 'rb').read()
data
# Out[3]:
b'\x00\x00\x00\x07spam\x00\x08'

# In[4]:
list(data)
# Out[4]:
[0, 0, 0, 7, 115, 112, 97, 109, 0, 8]

# In[5]:
struct.unpack('>i4sh', data)
# Out[5]:
(7, b'spam', 8)

Mastering Python
by Rick van Hattem
A book that can teach you about the more advanced techniques possible within Python
Self-study notes
4/5/2018

References

Code for the book on GitHub

Chap 1 ：Getting Started
One Environment per Project
Getting a virtual Python environment using venv
Bootstrapping pip using ensurepip
Installing packages based on distutils (C/C++) with pip

Creating a virtual Python Environment

venv :
- distributed with Python since 3.3; simple and straightforward, with no features beyond the bare necessities
or preferably
virtualenv :
- the most significant difference is the wide variety of Python versions that virtualenv supports.
- Supports convenient wrappers such as virtualenvwrapper (http://virtualenvwrapper.readthedocs.org/)

# pyvenv test_venv          (the pyvenv script was deprecated in Python 3.6; prefer the python3 -m venv form below)
# . ./test_venv/bin/activate
(test_venv) #

# python3 -m venv test_venv
# . ./test_venv/bin/activate
(test_venv) #

Manual pip install

Download get-pip.py file: http://bootstrap.pypa.io/get-pip.py
Execute the get-pip.py file :

```
# python get-pip.py
```

Install C/C++ packages

Debian and Ubuntu

```
# sudo apt-get install python3-dev
```

However, this installs the development headers only ("python.h"). If you want the compiler and other headers bundled with the install, then the build-dep command is also very useful. Here is an example:

```
# sudo apt-get build-dep python3
```

Red Hat, CentOS, and Fedora (these use yum rather than apt-get)

# sudo yum install python3-devel
# sudo yum-builddep python3

OS X / Windows : read the book

Data Science from Scratch
- First Principles with Python by Joel Grus: self-study notes
4/6/2018

Import

About modules

After importing a module, you refer to its features by prefixing them with the module name:

import re
my_regex = re.compile("[0-9]+", re.I)

If your program already uses the name re for something else, you can import the module under an alias:

import re as regex
my_regex = regex.compile("[0-9]+", regex.I)

An alias is also the natural choice when the module name is long:

```
import matplotlib.pyplot as plt
```

If you need only specific values from a module, import them explicitly:

from collections import defaultdict, Counter
lookup = defaultdict(int)
my_counter = Counter()

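Commonly used modules

Python 2.7 uses integer division by default, so 5 / 2 = 2. Most of the time we want floating-point division, which requires the import below.
After that, use a double slash when integer division is needed: 5 // 2. (In Python 3, / is always true division, so the import is unnecessary.)

```
from __future__ import division 
```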
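In Python 3 the two division operators behave as sketched below without any import. The variable names are only for illustration.

```python
true_div = 5 / 2        # true division always gives a float
floor_div = 5 // 2      # floor division drops the fractional part
neg_floor = -5 // 2     # it rounds toward negative infinity, not toward zero

quotient, remainder = divmod(5, 2)   # both results in one call

print(true_div, floor_div, neg_floor, quotient, remainder)
```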

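Functions

Python functions are first-class: a function is an object like any other, so it can be assigned to a variable and passed as an argument to another function.
It looks odd at first; here f is itself a function:

def double(x):
    return x * 2

def apply_to_one(f):
    return f(1)

my_double = double
x = apply_to_one(my_double)    # equals 2

Note that the function itself is passed, not the result of calling it. The following therefore fails, because double(1) is the integer 2, and apply_to_one then tries to call 2(1):

my_double = double
x = apply_to_one(double(1))    # TypeError: 'int' object is not callable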
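The sketch below extends the example: functions can be stored in a dictionary, passed around, and created inline with lambda. It also shows what happens when the result of a call is passed instead of the function itself. The square function and the operations dictionary are additions for illustration.

```python
def double(x):
    return x * 2

def square(x):
    return x * x

def apply_to_one(f):
    return f(1)

# Functions are ordinary objects: they can be stored in a dictionary.
operations = {'double': double, 'square': square}
results = {name: func(3) for name, func in operations.items()}

# An anonymous function can be passed directly.
via_lambda = apply_to_one(lambda x: x + 4)

# Passing double(1) passes the integer 2, which cannot be called.
try:
    apply_to_one(double(1))
    failure = None
except TypeError as exc:
    failure = str(exc)

print(results, via_lambda, failure)
```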

Function parameters can also have default values

For example:

def my_print(message="my default message"):
    print(message)

my_print("hello")     # prints 'hello'
my_print()            # prints 'my default message'
