Senthil Kumaran
Hacker, Python Core Developer and an Experimenter.
Senthil Kumaran
Python Core Developer.
Maintainer of urllib2 and related modules.
Software Engineer at Akamai, India.
Python 3.2 (py3k branch, which will become trunk)
Python 3.1.2 (release31-maint)
Python 3.0 is not there!
Python 2.7 (current trunk)
Python 2.6.5 (release26-maint)
Guido van Rossum - 05/April/2006 (It's June/2010 now. A lot of thought has gone in.)
Python 3.0 will break backwards compatibility with Python 2.x.
Python 2.6+ to support forward compatibility.
Implementation Details – Write in C.
Source code conversion tool (2to3), context free source to source translation.
Python 3000, Python 3.0 and Py3K are all names for the same thing. The project is called Python 3000, or abbreviated to Py3k. The actual Python release will be referred to as Python 3.0; the same naming convention as for Python 2.x is used.
Guido planned for Python 3.1 and 3.2 to be released much sooner after 3.0 than has been customary for the 2.x series. The 3.x release pattern will stabilize once the community is happy with 3.x.
Actually, Python 3.0 shipped with an io module written in Python, which caused significant performance problems. Python core developers recommend against using Python 3.0 for any purpose. Use Python 3.1, which is better.
Python 3.x will break backward compatibility, but Python 2.6 and Python 2.7 will support forward compatibility in the following ways.
It will support a "Py3k warnings mode" which will warn dynamically (i.e. at runtime) about features that will stop working in Python 3.0, e.g. assuming that range() returns a list.
It will contain backported versions of many Py3k features, either enabled through __future__ statements or simply by allowing old and new syntax to be used side-by-side (if the new syntax would be a syntax error in 2.x).
The 2to3 source conversion tool is provided. This tool can do a context-free source-to-source translation. For example, it can translate apply(f, args) into f(*args). However, the tool cannot do data flow analysis or type inferencing, so it simply assumes that apply in this example refers to the old built-in function.
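The two examples mentioned above (apply() and range()) can be written so that the same source behaves identically on Python 2.6+ and Python 3. This is only a minimal sketch; the helper names call_with and first_evens are made up for illustration.

```python
def call_with(f, args):
    """Hypothetical helper: the spelling 2to3 produces for apply(f, args)."""
    return f(*args)


def first_evens(n):
    """Hypothetical helper: return the first n even numbers as a real list."""
    # Wrapping range() in list() avoids assuming that range() returns a list,
    # which is one of the things the Py3k warnings mode complains about.
    return list(range(0, 2 * n, 2))


print(call_with(max, [1, 5, 3]))
print(first_evens(3))
```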
The recommended development model for a project that needs to support Python 2.7 and 3.2 simultaneously is as follows:
0. You should have excellent unit tests with close to full coverage.
1. Port your project to Python 2.7.
2. Turn on the Py3k warnings mode.
3. Test and edit until no warnings remain.
4. Use the 2to3 tool to convert this source code to 3.2 syntax. Do not manually edit the output!
5. Test the converted source code under 3.2.
6. If problems are found, make corrections to the 2.7 version of the source code and go back to step 3.
7. When it's time to release, release separate 2.7 and 3.2 tarballs (or whatever archive form you use for releases).
It is recommended not to edit the 3.2 source code until you are ready to reduce 2.7 support to pure maintenance (i.e. the moment when you would normally move the 2.7 code to a maintenance branch anyway).
Brett Cannon – Miscellaneous Python 3.0 Plans.
This covers the overall implementation details and serves as a kind of index to the other PEPs, which detail the specific changes.
One PEP to Check them all.
PEP 238 (Changing the Division Operator)
PEP 328 (Imports: Multi-Line and Absolute/Relative)
PEP 343 (The "with" Statement)
PEP 352 (Required Superclass for Exceptions)
http://www.python.org/dev/peps/pep-3100/
Python Regrets presentation: http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.pdf
Talin – Advanced String Formatting
Current ones - '%' and string.Template methods.
'{0},{1}'.format('hello','world')
Details the format specifier options.
[[fill]align][sign][#][0][minimumwidth][.precision][type]
User-defined formatting and other issues and use cases.
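A few calls to str.format() show how the specifier fields listed above combine. This is a small sketch with made-up values, not an exhaustive tour of the mini-language.

```python
# Positional and keyword replacement fields.
greeting = '{0},{1}'.format('hello', 'world')
named = '{lang} {version}'.format(lang='Python', version=3)

# Format spec: [[fill]align][sign][#][0][minimumwidth][.precision][type]
padded = '{0:*^10}'.format('py3k')     # fill '*', centred, minimum width 10
signed = '{0:+.2f}'.format(3.14159)    # explicit sign, two decimals, fixed point
hexed = '{0:#x}'.format(255)           # '#' asks for the 0x prefix

print(greeting, named, padded, signed, hexed)
```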
Make print a function
Good move!
Rationale for making print as a function.
Signature of the print function
def print(*args, sep=' ', end='\n', file=None)
Not backwards compatible. All your Python 2 code which contains print statements will break.
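At some point in application development one quite often feels the need to replace print output by something more sophisticated, like logging calls or calls into some other I/O library. With a print() function, this is a straightforward string replacement; today it is a mess adding all those parentheses and possibly converting >>stream style syntax.
Having special syntax for print puts up a much larger barrier for evolution, e.g. a hypothetical new printf() function is not too far fetched when it will coexist with a print() function.
There's no easy way to convert print statements into another call if one needs a different separator, not spaces, or none at all.
If print() is a function, it would be much easier to replace it within one module (just def print(*args):...) or even throughout a program (e.g. by putting a different function in __builtin__.print). As it is, one can do this by writing a class with a write() method and assigning that to sys.stdout -- that's not bad, but definitely a much larger conceptual leap, and it works at a different level than print.
Luckily, as it is a statement in Python 2, print can be detected and replaced reliably and unambiguously by an automated tool, so there should be no major porting problems.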
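The keyword arguments in the signature, and the ease of wrapping print() once it is an ordinary function, can be sketched as follows. The wrapper name shout is hypothetical, and an in-memory StringIO stands in for a real output file.

```python
import io

buffer = io.StringIO()

# sep, end and file replace the old trailing-comma and >>stream syntax.
print('a', 'b', 'c', sep='-', end='!\n', file=buffer)


def shout(*args, **kwargs):
    """Hypothetical replacement: upper-case every argument, then print."""
    print(*(str(arg).upper() for arg in args), **kwargs)


shout('hello', 'world', file=buffer)
output = buffer.getvalue()
```

Because shout() accepts the same keyword arguments, it can be swapped in wherever print() is called.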
Guido – Changes to dict.keys(), .values() and .items()
They return set-like or unordered container objects whose contents are derived from the underlying dictionary, rather than a list which is a copy of the keys etc.
Remove the iterkeys, itervalues and iteritems methods.
Inspired by the Java Collections Framework.
Revamping dict.keys(), .values() and .items()
This PEP proposes to change the .keys(), .values() and .items() methods of the built-in dict type to return a set-like or unordered container object whose contents are derived from the underlying dictionary rather than a list which is a copy of the keys, etc.; and to remove the .iterkeys(), .itervalues() and .iteritems() methods.
The approach is inspired by that taken in the Java Collections Framework.
The objects returned by the .keys() and .items() methods behave like sets. The object returned by the values() method behaves like a much simpler unordered collection -- it cannot be a set because duplicate values are possible.
Because of the set behavior, it will be possible to check whether two dicts have the same keys by simply testing:
if a.keys() == b.keys(): ...
and similarly for .items().
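A short sketch of the view behaviour described above, using two made-up dictionaries: key views compare like sets, support set operations, and stay in sync with the dictionary.

```python
a = {'x': 1, 'y': 2}
b = {'y': 20, 'x': 10}

same_keys = a.keys() == b.keys()     # set-style comparison, order is irrelevant
common = a.keys() & {'x', 'z'}       # set operations work directly on the view

keys = a.keys()
a['z'] = 3                           # the view reflects later changes to the dict
sees_new_key = 'z' in keys

key_list = sorted(a.keys())          # ask for a list explicitly when one is needed
```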
Standard Library Reorganisation.
Removal of obsolete modules, restructuring of the modules.
Modules removed – cfmfile, cf, md5, mimetools, MimeWriter, mimify, multifile, posixfile, rfc822, sha, sv, timing.
Removal of platform specific modules.
Modules and Code with PEP8 Violation
The work is tracked as a bug tracker issue.
Changes to Python's Exception Raise mechanism.
Intended to reduce line noise as well as the size of the language.
Consolidation of several raise forms – raise X,Y and raise X(Y) which are equivalent in py2.
Changes to raise exception syntax and throw method in generator object.
To hold the exception object: except E as e
One of Python's guiding maxims is "there should be one -- and preferably only one -- obvious way to do it". Python 2.x's raise statement violates this principle, permitting multiple ways of expressing the same thought. For example, these statements are equivalent:
raise E, V
raise E(V)
There is a third form of the raise statement, allowing arbitrary tracebacks to be attached to an exception:
raise E, V, T
where T is a traceback. As specified in PEP 344, exception objects in Python 3.x will possess a __traceback__ attribute, admitting this translation of the three-expression raise statement:
raise E, V, T
is translated to
e = E(V)
e.__traceback__ = T
raise e
Using these translations, we can reduce the raise statement from four forms to two:
raise (with no arguments) is used to re-raise the active exception in an except suite.
raise EXCEPTION is used to raise a new exception. This form has two sub-variants: EXCEPTION may be an exception class or an instance of an exception class; valid exception classes are BaseException and its subclasses. If EXCEPTION is a subclass, it will be called with no arguments to obtain an exception instance.
To raise anything else is an error.
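The three-expression translation can be tried out in Python 3 roughly as below. This is a sketch only: ParseError and parse are hypothetical names, and the traceback is taken from sys.exc_info() purely to have a traceback object to attach.

```python
import sys


class ParseError(Exception):
    """Hypothetical application-level exception."""


def parse(text):
    """Convert text to int, re-raising failures as ParseError."""
    try:
        return int(text)
    except ValueError:
        tb = sys.exc_info()[2]           # traceback of the original failure
        e = ParseError('bad number: %r' % text)
        e.__traceback__ = tb             # Python 3 spelling of: raise E, V, T
        raise e


try:
    parse('x')
except ParseError as exc:
    caught = exc                         # keep a reference; 'exc' is cleared after the block
```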
Changes to catching Exceptions.
except E, N: becomes except E as N:
Open issues – Replacing or Dropping sys.exc_info()
http://mail.python.org/pipermail/python-dev/2006-March/062449.html
Simple input builtin for python
No longer both input() and raw_input().
There is just one: raw_input(), renamed to input().
Specification
The existing raw_input() function will be renamed to input().
The Python 2 to 3 conversion tool will replace calls to input() with eval(input()) and raw_input() with input().
BDFL Thoughts on this issue:
http://mail.python.org/pipermail/python-3000/2006-December/005249.html
I like the exact proposal made here better than any of the alternatives mentioned so far.
Against naming it readline(): the "real" readline doesn't strip the \n and returns an empty string for EOF instead of raising EOFError; I believe the latter is more helpful for true beginners' code.
Against naming it ask() and renaming print() to say(): I find those rather silly names that belong in toy or AI languages. Changing print from statement to function maintains Pythonicity; renaming it say() does not.
I don't expect there will be much potential confusion with the 2.x input(); that function is used extremely rarely. It will be trivial to add rules to the refactoring tool (sandbox/2to3/) that replace input() with eval(input()) and replace raw_input() with input().
Guido
Literal syntax for bytes object - b'bytes'
Convenient way to spell ASCII strings and arbitrary binary data.
bytes('Hello world', 'ascii')
'Hello world'.encode('ascii')
The proposed syntax is:
b'Hello world'
Existing spellings of an 8-bit binary sequence in Python 3000 include:
bytes([0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00])
bytes('\x7fELF\x01\x01\x01\0', 'latin-1')
'7f454c4601010100'.decode('hex')
The proposed syntax is:
b'\x7f\x45\x4c\x46\x01\x01\x01\x00'
b'\x7fELF\x01\x01\x01\0'
In both cases, the advantages of the new syntax are brevity, some small efficiency gain, and the detection of encoding errors at compile time rather than at runtime. The brevity benefit is especially felt when using the string-like methods of bytes objects:
lines = bdata.split(bytes('\n', 'ascii'))  # existing syntax
lines = bdata.split(b'\n')  # proposed syntax
And when converting code from Python 2.x to Python 3000:
sok.send('EXIT\r\n')  # Python 2.x
sok.send('EXIT\r\n'.encode('ascii'))  # Python 3000 existing
sok.send(b'EXIT\r\n')  # proposed
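The equivalence of the spellings can be checked directly in Python 3. In this sketch, bytes.fromhex() is used in place of the '...'.decode('hex') spelling, which released Python 3 does not offer on str.

```python
header = b'\x7fELF\x01\x01\x01\0'                      # bytes literal
from_list = bytes([0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00])
from_text = bytes('\x7fELF\x01\x01\x01\0', 'latin-1')
from_hex = bytes.fromhex('7f454c4601010100')

# String-like methods take bytes arguments, so the literal keeps calls short.
lines = b'one\ntwo'.split(b'\n')
command = 'EXIT\r\n'.encode('ascii')
```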
Removal of tuple parameter unpacking.
def fxn(a, (b, c), d): pass
Remove the ability to pass a container to be received as a tuple. More harmful than useful.
Introspection issues, uninformative error messages, few use cases.
Tuple parameter unpacking is the use of a tuple as a parameter in a function signature so as to have a sequence argument automatically unpacked. An example is:
def fxn(a, (b, c), d): pass
The use of (b, c) in the signature requires that the second argument to the function be a sequence of length two (e.g., [42, -13]). When such a sequence is passed it is unpacked and has its values assigned to the parameters, just as if the statement b, c = [42, -13] had been executed in the parameter.
Unfortunately this feature of Python's rich function signature abilities, while handy in some situations, causes more issues than they are worth. Thus this PEP proposes their removal from the language in Python 3.0.
The 2to3 refactoring tool will gain a fixer for translating tuple parameters to being a single parameter that is unpacked as the first statement in the function. The name of the new parameter will be changed. The new parameter will then be unpacked into the names originally used in the tuple parameter. This means that the following function
def fxn((a, (b, c))):
    pass
will be translated into:
def fxn(a_b_c):
    (a, (b, c)) = a_b_c
    pass
As tuple parameters are used by lambdas because of the single expression limitation, they must also be supported. This is done by having the expected sequence argument bound to a single parameter and then indexing on that parameter:
lambda (x, y): x + y
will be translated into:
lambda x_y: x_y[0] + x_y[1]
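Written by hand for Python 3, the same idea looks like the sketch below. The function body and the name add_pair are invented for illustration; only the unpacking pattern comes from the translation above.

```python
def fxn(a, b_c, d):
    # Explicit unpacking in the body replaces the old (b, c) tuple parameter.
    b, c = b_c
    return a + b + c + d


# Lambdas cannot contain statements, so they index into the single parameter.
add_pair = lambda x_y: x_y[0] + x_y[1]

total = fxn(1, [42, -13], 2)
pair_sum = add_pair((3, 4))
```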
Renaming iterator.next() to iterator.__next__()
In Python 2.x, next() is called implicitly as part of the iterator protocol, yet it lacks the double underscores of other protocol methods.
The rename is for consistency with other implicitly called methods.
next(x) → x.__next__()
iter(function, sentinel) <--> next(iterator, sentinel)
Following this pattern, the natural way to handle next is to add a next built-in function that behaves in exactly the same fashion.
next(x) --> internal tp_iternext --> x.__next__()
Further, it is proposed that the next built-in function accept a sentinel value as an optional second argument, following the style of the getattr and iter built-in functions. When called with two arguments, next catches the StopIteration exception and returns the sentinel value instead of propagating the exception. This creates a nice duality between iter and next:
iter(function, sentinel) <--> next(iterator, sentinel)
There have been a few objections to the addition of more built-ins. In particular, Martin von Loewis writes:
I dislike the introduction of more builtins unless they have a true generality (i.e. are likely to be needed in many programs). For this one, I think the normal usage of __next__ will be with a for loop, so I don't think one would often need an explicit next() invocation.
It is also not true that most protocols are explicitly invoked through builtin functions. Instead, most protocols can be explicitly invoked through methods in the operator module. So following tradition, it should be operator.next.
...
As an alternative, I propose that object grows a .next() method, which calls __next__ by default.
Implementation and Transformation
Method definitions named next will be renamed to __next__.
Explicit calls to the next method will be replaced with calls to the built-in next function. For example, x.next() will become next(x).
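A small iterator class shows both halves of the change: the method is spelled __next__, and callers use the next() built-in, optionally with a sentinel. The Countdown class is a made-up example.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):                  # was: def next(self) in Python 2
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


it = Countdown(2)
first = next(it)                         # was: it.next()
second = next(it)
finished = next(it, None)                # sentinel returned instead of StopIteration
```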
Metaclasses in Python 3
Changes the syntax for declaring metaclasses and alters the semantics of how classes with metaclasses are constructed.
class C(B1, B2, metaclass=MC, *more_bases, **kwds): pass
Information on metaclass
http://www.python.org/doc/essays/metaclasses/meta-vladimir.txt
http://stackoverflow.com/questions/100003/what-is-a-metaclass-in-python
http://www.voidspace.org.uk/python/articles/metaclasses.shtml
def make_hook(f):
    """Decorator to turn 'foo' method into '__foo__'"""
    f.is_hook = 1
    return f

class MyType(type):
    def __new__(cls, name, bases, attrs):

        if name.startswith('None'):
            return None

        # Go over attributes and see if they should be renamed.
        newattrs = {}
        for attrname, attrvalue in attrs.iteritems():
            if getattr(attrvalue, 'is_hook', 0):
                newattrs['__%s__' % attrname] = attrvalue
            else:
                newattrs[attrname] = attrvalue

        return super(MyType, cls).__new__(cls, name, bases, newattrs)

    def __init__(self, name, bases, attrs):
        super(MyType, self).__init__(name, bases, attrs)

        # classregistry.register(self, self.interfaces)
        print "Would register class %s now." % self

    def __add__(self, other):
        class AutoClass(self, other):
            pass
        return AutoClass
        # Alternatively, to autogenerate the classname as well as the class:
        # return type(self.__name__ + other.__name__, (self, other), {})

    def unregister(self):
        # classregistry.unregister(self)
        print "Would unregister class %s now." % self

class MyObject:
    __metaclass__ = MyType

class NoneSample(MyObject):
    pass

# Will print "NoneType None"
print type(NoneSample), repr(NoneSample)

class Example(MyObject):
    def __init__(self, value):
        self.value = value
    @make_hook
    def add(self, other):
        return self.__class__(self.value + other.value)

# Will unregister the class
Example.unregister()

inst = Example(10)
# Will fail with an AttributeError
#inst.unregister()

print inst + inst

class Sibling(MyObject):
    pass

ExampleSibling = Example + Sibling
# ExampleSibling is now a subclass of both Example and Sibling (with no
# content of its own) although it will believe it's called 'AutoClass'
print ExampleSibling
print ExampleSibling.__mro__
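The listing above uses the Python 2 __metaclass__ attribute. A minimal sketch of the Python 3 spelling, with the metaclass passed as a keyword in the class header, might look like this. Registry, Base, Plugin and the flag keyword are all hypothetical names chosen for the example.

```python
class Registry(type):
    """Hypothetical metaclass that records the name of every class it creates."""

    classes = []

    def __new__(mcls, name, bases, namespace, **kwds):
        # Extra keywords from the class header arrive here as **kwds.
        cls = super().__new__(mcls, name, bases, namespace)
        cls.options = dict(kwds)
        mcls.classes.append(name)
        return cls

    def __init__(cls, name, bases, namespace, **kwds):
        # Accept the same keywords, but do not forward them to type.__init__.
        super().__init__(name, bases, namespace)


class Base(metaclass=Registry):          # replaces: __metaclass__ = Registry
    pass


class Plugin(Base, flag=True):           # metaclass is inherited; 'flag' goes to kwds
    pass
```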
New I/O Module.
Specification for Basic Byte based I/O Stream to which we can add further buffering and text handling features.
Enables programmers to treat stream-like interfaces as streams, rather than only as file-like or socket objects.
The new I/O spec is influenced by Java's I/O libraries.
The open() factory method stays backwards compatible with old I/O.
Python allows for a variety of stream-like (a.k.a. file-like) objects that can be used via read() and write() calls. Anything that provides read() and write() is stream-like. However, more exotic and extremely useful functions like readline() or seek() may or may not be available on every stream-like object. Python needs a specification for basic byte-based I/O streams to which we can add buffering and text-handling features.
Once we have a defined raw byte-based I/O interface, we can add buffering and text handling layers on top of any byte-based I/O class. The same buffering and text handling logic can be used for files, sockets, byte arrays, or custom I/O classes developed by Python programmers. Developing a standard definition of a stream lets us separate stream-based operations like read() and write() from implementation specific operations like fileno() and isatty(). It encourages programmers to write code that uses streams as streams and not require that all streams support file-specific or socket-specific operations.
The new I/O spec is intended to be similar to the Java I/O libraries, but generally less confusing. Programmers who don't want to muck about in the new I/O world can expect that the open() factory method will produce an object backwards-compatible with old-style file objects.
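The layering described above can be sketched with the io module as it ships in Python 3. Here an in-memory BytesIO object stands in for a real byte source such as a file or socket; that substitution is an assumption made to keep the example self-contained.

```python
import io

# Byte layer: an in-memory stand-in for a file or socket.
raw = io.BytesIO('h\u00e9llo\nworld\n'.encode('utf-8'))

# Buffering layer on top of the byte stream.
buffered = io.BufferedReader(raw)

# Text layer: decoding and newline handling on top of the buffer.
text = io.TextIOWrapper(buffered, encoding='utf-8')

lines = text.readlines()
```

The same two wrappers can be stacked on any object that offers the byte-level interface, which is the point of separating the layers.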
Revising the buffer protocol.
Improving the ways Python allows memory sharing in Python 3.
It is useful for extension modules.
Guido – Introduction of Abstract Base Classes (ABC).
Way to overload isinstance and issubclass.
A new module abc which is to support the Abstract Base Class framework.
Specific ABCs for Containers and Iterators to be added to collections module.
What makes a set, mapping, sequence?
Companion PEP 3141 for numeric types.
ABCs are simply Python classes that are added into an object's inheritance tree to signal certain features of that object to an external inspector. Tests are done using isinstance(), and the presence of a particular ABC means that the test has passed.
In addition, the ABCs define a minimal set of methods that establish the characteristic behavior of the type. Code that discriminates objects based on their ABC type can trust that those methods will always be present. Each of these methods is accompanied by a generalized abstract semantic definition that is described in the documentation for the ABC. These standard semantic definitions are not enforced, but are strongly recommended.
Like all other things in Python, these promises are in the nature of a gentlemen's agreement, which in this case means that while the language does enforce some of the promises made in the ABC, it is up to the implementer of the concrete class to ensure that the remaining ones are kept.
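A brief sketch of both sides of the framework: defining an ABC with the abc module, and testing against one of the container ABCs in collections.abc. Shape, Square and Bag are hypothetical classes; abc.ABC is a convenience base class from later Python 3 releases.

```python
from abc import ABC, abstractmethod
from collections.abc import Sized


class Shape(ABC):
    """Hypothetical ABC: every concrete shape must provide area()."""

    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Bag:
    """Unrelated class that happens to define __len__."""

    def __len__(self):
        return 0


square_area = Square(3).area()
bag_is_sized = isinstance(Bag(), Sized)   # recognised without inheriting from Sized

try:
    Shape()                               # abstract classes cannot be instantiated
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False
```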
Using UTF-8 as the default source Encoding.
Martin von Löwis
Changes the default source encoding of Python scripts from ASCII to UTF-8.
IDLE changed to use UTF-8 as its default encoding.
The parser needs to be changed to accept bytes > 127 if no source encoding is specified; instead of giving an error, it needs to check that the bytes are well-formed UTF-8 (decoding is not necessary, as the parser converts all source code to UTF-8, anyway).
Introduction of Class Decorators.
Collin Winter, Jack Diederich
It's an extension of function and method decorators.
Makes certain constructs more easily expressed and less reliant on CPython interpreter implementation details.
It's simpler than using metaclasses to achieve the same purpose for a class.
It is an extension to the function and method decorators introduced in PEP 318.
When function decorators were originally debated for inclusion in Python 2.4, class decorators were seen as obscure and unnecessary thanks to metaclasses. After several years' experience with the Python 2.4.x series of releases and an increasing familiarity with function decorators and their uses, the BDFL and the community re-evaluated class decorators and recommended their inclusion in Python 3.0.
The motivating use-case was to make certain constructs more easily expressed and less reliant on implementation details of the CPython interpreter. While it is possible to express class decorator-like functionality using metaclasses, the results are generally unpleasant and the implementation highly fragile. In addition, metaclasses are inherited, whereas class decorators are not, making metaclasses unsuitable for some, single class-specific uses of class decorators. The fact that large-scale Python projects like Zope were going through these wild contortions to achieve something like class decorators won over the BDFL.
Examination of Decorators - PEP 318
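A registration decorator is a typical use. The sketch below, with made-up names registry, register, Handler and SubHandler, also shows the point about inheritance: the decorator applies only to the class it is written on.

```python
registry = {}


def register(cls):
    """Hypothetical class decorator: record the class by name, return it unchanged."""
    registry[cls.__name__] = cls
    return cls


@register
class Handler:
    pass


class SubHandler(Handler):
    # Not decorated, so it is not registered; a metaclass would have been inherited.
    pass
```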
Extended Iterable unpacking
>>> a, *b, c = range(5)
>>> a
0
>>> c
4
>>> b
[1, 2, 3]
New super
Syntactic sugar for super.
super().foo(1, 2) instead of super(Foo, self).foo(1, 2)
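Both spellings still work in Python 3 and give the same result, as this sketch with hypothetical classes Base, Foo and OldStyle shows.

```python
class Base:
    def foo(self, a, b):
        return a + b


class Foo(Base):
    def foo(self, a, b):
        # New form: the class and instance are filled in automatically.
        return super().foo(a, b) * 2


class OldStyle(Base):
    def foo(self, a, b):
        # Explicit form, as required in Python 2.
        return super(OldStyle, self).foo(a, b) * 2


new_result = Foo().foo(1, 2)
old_result = OldStyle().foo(1, 2)
```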
Guido – Immutable bytes and Mutable buffer
bytes is an immutable array of bytes.
bytearray is a mutable array of bytes.
memoryview is a bytes view of another object.
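How the three types relate can be seen in a few lines. This is a sketch with made-up data: a memoryview over a bytearray writes through to it, while a bytes copy cannot be modified.

```python
data = bytearray(b'hello')      # mutable array of bytes
view = memoryview(data)         # shares memory with data, no copy is made
view[0] = ord('j')              # the write is visible through data

frozen = bytes(data)            # immutable snapshot of the current contents

try:
    frozen[0] = 0               # bytes does not support item assignment
    bytes_is_mutable = True
except TypeError:
    bytes_is_mutable = False
```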
String representation in Py3k.
Use repr(); `backticks` are gone (they were the same as repr()). repr() is Unicode based and not ASCII based in py3.
Context-free source code to source code translator.
Useful to convert the py2 source to py3.
Use it.
Look for str/bytes issues after the translation. Fix those and fix the interfaces (phew!).
You are done.
Georg Brandl – April-2006 – Things which will not change in Python 3000.
BDFL pronouncements on things which will not happen.
Things such as 'self' will not become implicit.
The parser won't be more complex than LL(1).
No braces, no more backticks.
The interpreter prompt will not change. It gives Guido warm fuzzy feelings.
Python language Moratorium.
Other Python Implementations to Catch up.
Lets changes already put into Python be adopted by a large user base.
Does not allow changes to the syntax and semantics of the language.
Defines what can change and what cannot change.
Improved GIL by Antoine Pitrou
Unladen Swallow Integration.
Excellent Test Coverage - Michael Foord.
An Excellent Language where there is more symmetry.
Symmetry is a complexity-reducing concept (co-routines include subroutines); seek it everywhere.
Alan J. Perlis — Epigrams on Programming, Sept., 1982
By Senthil Kumaran
Python 3 PEPS