from sklearn import datasets
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Load the digits dataset and flatten each image into a feature vector.
digits = datasets.load_digits()
n_samples = len(digits.images)
X = digits.images.reshape((n_samples, -1))
y = digits.target
# Hold out half of the data for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)
tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
                     'C': [1, 10, 100, 1000]},
                    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]
scores = ['precision', 'recall']
for score in scores:
    # Run one cross-validated grid search per scoring metric.
    clf = GridSearchCV(SVC(), tuned_parameters, cv=5, scoring='%s_macro' % score)
    clf.fit(X_train, y_train)
    print("Best parameters set found on development set:", clf.best_params_)
    means = clf.cv_results_['mean_test_score']
    stds = clf.cv_results_['std_test_score']
    for mean, std, params in zip(means, stds, clf.cv_results_['params']):
        print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))
    # Evaluate the best model on the held-out half.
    y_true, y_pred = y_test, clf.predict(X_test)
    print(classification_report(y_true, y_pred))
Types are sets of values.
Int: the set of all integer numbers.
Float: the set of all floating point numbers.
DbConnection: the set of all possible database connections.
To constrain the set of values a variable can hold:
n: int = 42
x: float = 42.5
rex: Dog = Dog("rex")
To constrain the set of values a function accepts as parameters:
def foo(x: int):
    return x + 1
To constrain the set of values a function can return:
import numpy as np
def foo() -> float:
    return np.random.random_sample()
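One thing worth keeping in mind about the annotations above: the interpreter stores them as metadata and does not check them when the program runs. Checking is left to external tools. A minimal sketch, where the function name `add_one` is made up for illustration:

```python
def add_one(x: int) -> int:
    return x + 1

# Annotations are stored on the function object as plain metadata.
print(add_one.__annotations__)

# The interpreter does not check them at call time: a float is accepted,
# even though a static type checker would be expected to flag this call.
result = add_one(1.5)
print(result)
```

So the constraint is a promise that tooling verifies before the program runs, not a runtime guard.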
The built-in simple types:
n: int = 42
x: float = 42.5
b: bool = True
s: str = 'hello world'
And built-in complex types:
from typing import Dict, List, Set, Tuple
xs: List[int] = [1,2,3,4,5]
d: Dict[str, str] = {'hello': 'world', 'ala': 'makota'}
s: Set[float] = {1.5, 2.5, 3.5}
t: Tuple[int, float, str] = (42, 42.5, "fortytwo")
Notice that for complex types you need to provide the type of the elements inside.
Btw, types like List that take other types as parameters are called "generic" types; their "type of a type" is called a "kind".
Use Callable for functions and lambdas (in callbacks and higher-order functions).
from typing import Callable
def f(x: int, y: int) -> float:
    pass
my_fun: Callable[[int, int], float] = f
my_other_fun: Callable[[str], bool] = lambda s: s.startswith('http')
The first parameter is a list of argument types. The second is the return type.
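To make the callback case concrete, here is a small sketch of a higher-order function whose parameter is annotated with Callable. The names `apply_to_all` and `double` are hypothetical, chosen only for this example:

```python
from typing import Callable, List


def apply_to_all(fn: Callable[[int], int], xs: List[int]) -> List[int]:
    # fn must accept exactly one int and return an int.
    return [fn(x) for x in xs]


def double(x: int) -> int:
    return 2 * x


# Both a named function and a lambda fit the Callable[[int], int] slot.
doubled = apply_to_all(double, [1, 2, 3])
incremented = apply_to_all(lambda x: x + 1, [1, 2, 3])
print(doubled, incremented)
```

Passing something like `str.upper` as `fn` would still be accepted by the interpreter until the call fails, but a type checker should reject it up front.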
It's all about being honest. Consider this function:
def get_db_connection(conn_str: str) -> DbConnection:
    try:
        return my_db.connect(conn_str)
    except DbError as e:
        log.error(e)
        return None
It's a nice function. It's also a liar. Its type is not DbConnection, because the set of values it can return is larger: it also contains None.
from typing import Optional
def get_db_connection(conn_str: str) -> Optional[DbConnection]:
    try:
        return my_db.connect(conn_str)
    except DbError as e:
        log.error(e)
        return None
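An honest Optional return type also tells the caller what to do: handle the None case before using the value. A minimal sketch of that pattern, using a hypothetical in-memory lookup (`USERS`, `find_user`, `greet`) in place of a real database connection:

```python
from typing import Dict, Optional

# Hypothetical stand-in for a real data source.
USERS: Dict[int, str] = {1: "ala", 2: "rex"}


def find_user(user_id: int) -> Optional[str]:
    # dict.get returns None when the key is missing.
    return USERS.get(user_id)


def greet(user_id: int) -> str:
    name = find_user(user_id)
    if name is None:
        return "unknown user"
    # Past the None check, a type checker treats name as a plain str.
    return "hello " + name.upper()


print(greet(1))
print(greet(99))
```

Without the `is None` check, a type checker would be expected to complain about calling `.upper()` on a value that might be None.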
Previously we said: "this function can return a DbConnection OR a None". This extends to other types as well:
from typing import Union
def get_db_connection(conn_str: str) -> Union[PSQLConnection, CouchConnection, None]:
    try:
        if conn_str.startswith('psql'):
            return my_psql_db.connect(conn_str)
        return my_couch_db.connect(conn_str)
    except DbError as e:
        log.error(e)
        return None
AVOID union types.
A type that's a Union of everything possible is called Any.
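One way to see the cost of unions, and the larger cost of Any, is to look at what the consumer of such a value has to do. The sketch below uses hypothetical functions `describe` and `describe_any`; it is an illustration, not a recommendation of any particular design:

```python
from typing import Any, Union


def describe(value: Union[int, str]) -> str:
    # Every consumer of a Union has to branch on the actual type.
    if isinstance(value, int):
        return "int: %d" % (value + 1)
    return "str: " + value.upper()


def describe_any(value: Any) -> str:
    # With Any, a type checker accepts every operation,
    # so mistakes only show up when the code runs.
    return value.upper()


print(describe(41))
print(describe("hi"))

try:
    describe_any(42)
except AttributeError as e:
    print("runtime failure:", e)
```

With the Union, a checker can still verify each branch. With Any, it has nothing left to verify.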
A class is a type by itself.
import random

class Foo:
    def __init__(self, x: int):  # self is untyped
        self.x = x
    def foo(self) -> int:
        return self.x + 10
    @classmethod
    def create(cls) -> 'Foo':  # note the quotes: Foo is not fully defined yet
        return cls(random.randint(10, 20))

def process_foo(f: Foo) -> Foo:
    print(f.foo())
    f.x += 1
    return f

foo: Foo = Foo.create()
foo2: Foo = process_foo(foo)
"Give me whatever, I'll give you back the same whatever"
E.g. what's the type of a function that reverses a list?
Given a List[int] it returns a List[int].
Given a List[Foo] it returns a List[Foo].
from typing import List, TypeVar
T = TypeVar('T')
def my_reverse(xs: List[T]) -> List[T]:
    xs.reverse()  # list.reverse works in place and returns None
    return xs
The typevar is a name for "whatever type".
Notice how List[T] is very different from List[Any].
List[Any] could be: [1, 1.5, "hello"].
List[T] has all of the elements of the same type.
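The difference becomes visible in what a type checker can say about the result. A short sketch, with hypothetical helper names `first` and `first_any`:

```python
from typing import Any, List, TypeVar

T = TypeVar('T')


def first(xs: List[T]) -> T:
    # The result has the same type as the list's elements.
    return xs[0]


def first_any(xs: List[Any]) -> Any:
    # The result could be anything; the link to the input is lost.
    return xs[0]


n = first([1, 2, 3])              # a checker can infer int here
s = first(["a", "b"])             # and str here
v = first_any([1, 1.5, "hello"])  # here it knows nothing about v
print(n, s, v)
```

At runtime both functions behave identically; the TypeVar only changes how much information survives for the checker and for the reader.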
T = TypeVar('T')
def my_fun1(xs: List[T]) -> List[T]:
    # cannot modify elements of the list!
    pass
BONUS QUESTION:
Why can't my_fun1 modify the elements of the list?
Sometimes type names are cumbersome to write out.
from typing import Dict, Iterable, Optional
def create_user_mapping() -> Optional[Iterable[Dict[str, User]]]:
    pass
We can give a name to that type:
UserMapping = Optional[Iterable[Dict[str, User]]]
def create_user_mapping() -> UserMapping:
    pass
And save some keystrokes in the long run.
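An alias is only another name for the same type, so the two spellings are interchangeable everywhere. A runnable sketch with a hypothetical alias `ScoreBoard` (not part of the example above):

```python
from typing import Dict, List

# Hypothetical alias: maps a user name to that user's scores.
ScoreBoard = Dict[str, List[int]]


def best_scores(board: ScoreBoard) -> Dict[str, int]:
    # Keep only the highest score for each user.
    return {name: max(scores) for name, scores in board.items()}


board: ScoreBoard = {"ala": [3, 7, 5], "rex": [9, 2]}
print(best_scores(board))

# The alias and the spelled-out type are the same thing.
print(ScoreBoard == Dict[str, List[int]])
```

Because no new type is created, an alias adds readability but no extra safety.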
How to make sure we never treat inches as centimeters? Both are really floats in the program...
from typing import NewType
Inch = NewType('Inch', float)
Cm = NewType('Cm', float)
def inch2cm(x: Inch) -> Cm:
    pass
def cm2inch(x: Cm) -> Inch:
    pass
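A filled-in version of the conversion might look like the sketch below. The function body and the variable names `length` and `converted` are assumptions added for illustration:

```python
from typing import NewType

Inch = NewType('Inch', float)
Cm = NewType('Cm', float)


def inch2cm(x: Inch) -> Cm:
    # Wrap the result so the checker knows it is in centimeters.
    return Cm(x * 2.54)


length = Inch(10.0)
converted = inch2cm(length)
print(converted)

# At runtime NewType adds nothing: the value is still a plain float.
print(type(converted))

# A type checker would be expected to reject the call below,
# because a Cm is passed where an Inch is required:
# inch2cm(converted)
```

The distinction exists only for the type checker, so the extra safety comes at essentially no runtime cost.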
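Without these types, nothing stops us from mixing up the units:
def inch2cm(x):
    return x * 2.54
x_in_cm = 4.23
y = inch2cm(x_in_cm)
That makes no sense: a value that is already in centimeters gets converted "to centimeters" again, and nothing complains.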
IDEs like PyCharm check types on the fly. They also provide excellent type-driven autocompletion.
There's also mypy, a standalone static type checker:
$ pip install mypy
$ mypy test.py
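As an illustration of what such a check catches, here is a hypothetical `test.py`. The function `double` is made up for this example, and the exact wording of the checker's report is not reproduced here:

```python
# test.py: a hypothetical file to run the checker on.
def double(x: int) -> int:
    return x * 2


# This runs without error in the interpreter ("ab" * 2 is valid Python),
# but a type checker should report an incompatible argument type,
# because a str is passed where an int is declared.
result = double("ab")
print(result)
```

The program runs and prints a string, which is exactly the kind of silent mistake a static check is meant to surface before the code ships.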