Python 2 vs. Python 3: A retrospective
I'd really like to see the video for these slides. But here's what caught my interest:
Set and dict comprehensions
{x**2 for x in range(10)}
{x: x**2 for x in range(10)}
Why reduce() must die:
... the applicability of reduce() is pretty much limited
to associative operators, and in all other cases it's
better to write out the accumulation loop explicitly.
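A sketch of the contrast Guido is drawing (the numbers and the gap computation are made up for illustration):

```python
from functools import reduce  # moved out of the builtins in Python 3

numbers = [1, 2, 3, 4, 5]

# reduce() reads fine for an associative operator like +:
total = reduce(lambda a, b: a + b, numbers)

# ...but for a less trivial accumulation, e.g. the largest gap
# between consecutive elements, the explicit loop is clearer:
max_gap = 0
prev = numbers[0]
for n in numbers[1:]:
    max_gap = max(max_gap, n - prev)
    prev = n

print(total, max_gap)  # 15 1
```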
int [divided by] int should return float
nonlocal
Explicit nonlocal variable modifier which (I guess) "promotes" the variable outside its local scope. Kind of the inverse of Java requiring 'final' to bind a variable to a closure.

You know what's funny? All of those things could have been done in Python 2.8, apart from the int division change. And the division change does as much harm as good, because lots of people use Python and also use another language where int division works the "old fashioned way"; for them (me) this change is counter-productive because it adds a pointless distinction. It is a great change for programming novices, for sure, but that's only part of Python's audience, and probably won't be the longest-lived part.
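A minimal sketch of the nonlocal "promotion" mentioned above (make_counter is a made-up example):

```python
def make_counter():
    count = 0
    def increment():
        # Without 'nonlocal', the assignment below would create a new
        # local 'count' and the += read would raise UnboundLocalError.
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```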
> another language where int division works the "old fashioned way"; for them (me) this change is counter-productive ... It is a great change for programming novices
Douglas Crockford made a point in a recent interview[1] (not the first time, I'm sure) that this is exactly the wrong reason to keep doing things "the way it's always been". Other examples he mentions: line endings (CR/LF), integer overflow, short vs long. Big vs little endian would be another obvious example.
Fred Brooks (Mythical Man Month) calls this "accidental complexity".
> [novices are] only part of Python's audience, and probably won't be the longest-lived part
By definition. But, it's not really a good use of anyone's time to be dealing with truncation in a time when it no longer has any reason to be the default except historical accident.
[1] http://hanselminutes.com/396/bugs-considered-harmful-with-do...
I have tracked down many confounding bugs caused by people accidentally using integer division. This change is one of the main reasons I use python 3; it gives me one less thing to worry about, because integer division (which is only correct occasionally) stands out clearly with //.
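In Python 3 the two operations read differently at a glance, which is the point being made; a quick sketch:

```python
# Python 3: true division always yields a float...
print(7 / 2)    # 3.5
print(6 / 3)    # 2.0 (a float even when it divides evenly)

# ...while intentional integer division stands out with //:
print(7 // 2)   # 3
print(-7 // 2)  # -4 (floor division rounds toward negative infinity)
```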
> All of those things could have been done in Python 2.8
The first one is in 2.7 (it's completely backwards compatible). The division is in 2.5 or 2.6, imported from __future__. #4 skirts the line; new keywords have been added in point releases in the past. #2 wouldn't fly.
But here's the thing, P3 was not about these (or not only about them), they got bundled in because P3 was allowed to be significantly backwards-incompatible and thus a lot of changes became acceptable which were not justifiable or much harder to justify in a point release. The primary breakage point of P3 is not any of these, it's the string changes.
All of these things, including the int division "thing", are present in Python 2.7 already. For division, you need to import it:

    from __future__ import division

And then dividing numbers works a bit more logically (well, I think it's more logical). You can do division with rounding with the // operator: a // b, and it even works with floats: (3.9 // 1.2) == 3

The division change is good. I have a hard time understanding why you'd want

    3/2 = 1

as the default behaviour. Imagine the following (admittedly bad, minimalistic to make my point) code:

    x = int(input(">>> "))
    a = x / 2
    append_int_to_magical_db(a)

If the division does a "natural" thing, you suddenly have a float "polluting" your integer algorithm, but it's _not consistent_. If the user enters "4", you get an int back. If they enter 5, you get a float.

> If the user enters "4", you get an int back.
No, it always returns a float in Python 3. 4/2 gives 2.0.
Users of Python 3 and up will just have to remember that the result of any division will be a float.
It's a breaking change, yes, but in general I think it's a good one.
And if they do want integer division, they have to remember // and %.
It's bad and it does not make your point. In P2 you always get an int, in P3 you always get a float.
a = x // 2
The one thing I wish they could change in the future is unifying the list and dictionary iteration syntax:
instead of writing
for index, element in enumerate(some_list)
for key, value in my_dict.items()
they should unify and make items and enumerate default behavior. i.e.
for index, element in my_list:
for key, value in my_dict:
I really don't see the benefit of not doing this as default behavior. I always find if I need to loop a list there is a good chance the index can help, and even if I don't need it it doesn't hurt to have one either. Simple is better. And looping a dict and getting back only the key sucks too, because you often need the value as well, so you essentially end up doing dict["key"] anyway; why not just return both key and value by default?
Explicit is better than implicit.
I think you think for is magical and could be modified like this, but really for just iterates over something. It's enumerate and items that are the magic. Enumerate zips a range onto a list, the ``index,element`` unpacks the zip. Items returns a list of (key,value) tuples and the key,value unpacks that.
You couldn't modify the iterators because it would affect EVERYTHING. sum([1,1,1]) would now be sum([(0,1),(1,1),(2,1)]) AHH! And ``key in dict`` wouldn't work any more, since the iterator would return key,values. EEK.
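The enumerate/items point can be made concrete; a sketch (my_list and my_dict are made-up names):

```python
my_list = ['a', 'b', 'c']
my_dict = {'x': 1, 'y': 2}

# enumerate is roughly zip(range(len(seq)), seq):
assert list(enumerate(my_list)) == list(zip(range(len(my_list)), my_list))

# items() yields (key, value) tuples; the for target unpacks them:
for key, value in my_dict.items():
    print(key, value)

# ...while plain iteration over a dict yields only keys:
assert list(my_dict) == list(my_dict.keys())
```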
I never said it would be easy to implement or whether it would actually be possible. My complaint is that what we are doing right now isn't convenient and is counter-intuitive. I don't write programming languages, so I wouldn't know how difficult it would be to change the grammar and the semantics of for.
Being simple vs. explicit is a political debate. I'd prefer it if Python had the simpler, more magical syntax.
That's totally against the ethos of python. I think you'll find that very few python developers would side with you on that change. If I'm iterating over something I'd like the elements of the thing I'm iterating over, not some weird results based on the type of iterator. In the (extremely) rare cases I need an index I wrap the list/whatever in the enumerate function.
What are the use cases where you frequently need the index? I write and read a lot of python and I almost never see it. Maybe if you're working in a problem domain where it's a common issue you could create some abstraction to better handle it for you.
Whoever downvoted me has some weird negativity here.
It isn't against the ethos of Python. There is a Zen of Python; I don't know of an Ethos of Python. http://www.python.org/dev/peps/pep-0020/
Beautiful is better than ugly. Writing .items() or surrounding enumerate() does not make your code look prettier, does it?
Explicit is better than implicit. In fact, my proposal is more explicit than the implicitness of for key in my_dict or for item in my_list. I think returning both key and value is more explicit than, say, remembering that the default value when looping over a dict is the key, not the value, or remembering to use enumerate or items.
It is simpler to write such code, and if you just need one of the return values, just that one, they are named.
You want a use case; I will give you one: find the location of a specific item in the list. And if you use Django, you know your template engine can read your pseudo-Python template code, and you often need that index.
my proposal is more explicit
One could just as easily say that your proposal is less explicit, because a sequence is a sequence of items, not (index, item) tuples, yet you're making the "for" iteration yield tuples.
Similarly, a dict is a container of keys, not (key, value) pairs. The reason is that otherwise you would be unable to check to see if a key was in the dict unless you knew the value that went with it: but in that case why would you need the dict? (Technically, you could still iterate over the entire dict looking for your key, but that's extremely slow; the whole point of having a dict is to be able to do fast lookups of keys in order to retrieve their values.)
Writing .items() or surrounding enumerate() does not make your code look prettier, does it?
Neither does having to extract just one item from a tuple when I don't need the index. There's no way around the fact that one of the types of iteration (either just items, or index, item tuples) is going to have to be spelled with something extra. So just saying "I shouldn't have to type something extra" isn't a sufficient argument. You need to justify why your preferred type of iteration should be the one with the shorter spelling: and since your preferred type of iteration has extra baggage attached to it, it seems perfectly legitimate to me that it should have the longer spelling, not the shorter one.
find the location of a specific item in the list
That's what the "index" method is for.
I forget if it's Jinja or Django's template, but one of them has a function that only exists inside of for loops that returns the current index. I thought that was a great way to handle it, since it's always available without changing any syntax.
In retrospect, I suppose the .index() method would be fine, since Lists can't hold two of the same object, and Dicts return the key by default.
Lists can't hold two of the same object
Yes, they can. Try it! The index method has extra arguments to let you specify a range of indexes in the list to search, so you can find multiple indexes that point to the same object (by picking a range that excludes indexes you've previously found).
Dicts return the key by default
Dicts don't have an index method; dict keys are not ordered.
I almost never use enumerate in python - 90%+ of the time it's the values I'm interested in - an exception might be if I'm trying to print a numbered list - and it's just handy to have the index instead of incrementing a counter. In your scenario, though, to find the location of a specific item in a list:
    >>> b = ['one','two','three','four','two']
    >>> b.count("two")
    2
    >>> b.index("two")
    1
    >>> b.index("two", 2)
    4

Try building binary search with .index()
I didn't downvote you, but I feel I should. You're suggesting making a magical change to a language because you seem to have failed to grasp the way it works.
If the best use case you can come up with is the one you've given then we've got real issues. If my team were ever iterating over a list to find the index of an item I would be very upset. That is categorically not the right way to do it.
Regarding the ethos - zen, ethos, call it what you will. Explicit is better. Maybe English isn't your first language, but your idea is less explicit than the way it works at the moment.
The reason why Python works the way it does is specifically to keep the magical syntax as simple as possible.
Consider the following:
    menu = [("Apples", 5), ("Cream Pie", 2), ("Tea and scones", 3)]
    for food, price in menu:
        print "To buy %s, please pay %d dollars" % (food, price)

Right now, it works unambiguously -- just the way you'd expect. The above prints:

    To buy Apples, please pay 5 dollars.
    To buy Cream Pie, please pay 2 dollars.
    To buy Tea and scones, please pay 3 dollars.

What if we implemented your rule? Would the interpreter print the above, or would it say this?

    To buy 0, please pay ("Apples", 5) dollars.
    To buy 1, please pay ("Cream Pie", 2) dollars.
    To buy 2, please pay ("Tea and scones", 3) dollars.

What if you wanted to print the first one? If your syntax were implemented, the programmer would have to write something awful like

    for index, (food, price) in menu:

or even

    for food, price in destructuring_without_index(menu):

which puts us in full circle again! The reason why most of us don't like your idea is because it introduces ambiguity and doesn't even remove the trade-off. No matter how you implement it, there's going to be a trade-off.
The zen of python, by Tim Peters:
Explicit is better than implicit. Special cases aren't special enough to break the rules. In the face of ambiguity, refuse the temptation to guess.
And ``key in dict`` wouldn't work any more
The "in" operator is actually a separate operator; it doesn't depend on the dict iterator implementation. So you could keep the semantics of the two separate.
However, if the dict iterator semantics were changed, it would make sense to change the semantics of "in" as well--and since it doesn't really make sense to change the semantics of "in" (if you're asking if something is "in" the dict, you mean the key), that would explain why the dict iterator semantics are the way they are: the dict is a container of keys, not key-value pairs.
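A toy sketch of the two protocols being independent (KeysAsPairs is a made-up class):

```python
class KeysAsPairs:
    """Toy mapping whose iterator yields (key, value) pairs but whose
    'in' still tests keys -- showing that iteration (__iter__) and
    membership (__contains__) are separate protocols."""
    def __init__(self, data):
        self._data = dict(data)
    def __iter__(self):
        return iter(self._data.items())
    def __contains__(self, key):
        return key in self._data

m = KeysAsPairs({'a': 1})
assert ('a', 1) in list(m)   # iteration yields pairs...
assert 'a' in m              # ...but 'in' still checks keys
```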
I've thought that dict.items() should be the default behavior for a while, but haven't really thought it through. The 'key in dict' example is interesting--is the expected semantics of 'x in y' for sequence types the same as 'any(x == z for z in y)'? That doesn't hold true for strings, as an example.
is the expected semantics of 'x in y' for sequence types the same as 'any(x == z for z in y)'?
For sequences that aren't strings, yes. :-) Strings are a sort of hybrid between "atomic" values and sequences, so their semantics can be different.
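The string exception is easy to see; a sketch:

```python
# For most sequences, 'x in y' matches element-wise equality:
assert (2 in [1, 2, 3]) == any(2 == z for z in [1, 2, 3])

# Strings are the exception: 'in' does substring search,
# while iteration yields single characters.
assert "ab" in "abc"                      # substring match: True
assert not any("ab" == z for z in "abc")  # no single char equals "ab"
```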
I always find if I need to loop a list there is a good chance the index can help
You may always find this, but lots of people don't. I find myself doing lots of iteration that doesn't need the index.
and even if I don't need it it doesn't hurt to have one either
Yes it does, because you're adding extra code to compute the index to every iteration, whether it's needed or not. It's not a good idea to encumber such a basic language construct with any extra baggage; that's why the extra baggage is in "enumerate", so you only get it if you actually need it.
First, keep in mind that as others have said, there is no "magic" involved here.
`list(seq)` will return a list of elements. `list(dct)` will return a list of keys. A for-loop will always loop over the abstract "list representation" of the object (the list result itself is just an accumulation of values from a for-loop).
So your first suggestion would make no sense whatsoever. As for the second...one could maybe argue for that. Many people end up using `dct.keys()` anyway when they want the keys, and `.items()` is so common that it could maybe be made the default. Ruby actually does this by default. However, it would break a ton of current code where people expect to be looping over the keys.
If you're proposing that I can write either `[x for x in X]` for iterating without indices and `[i,x for x in X]` (or `[i,x for i,x in X]`) then you introduce ambiguity (what if it's a list of 2-tuples?). If you mean that I always have to write `[i,x ...]` even if I don't need the indices, then that introduces a lot of noise for the distant minority case.
Dictionaries also have the same ambiguity problem (tuples can be dictionary keys). The noise problem is a bit more justified, but the entire construct is less needed since `D[k]` is less clunky and error-prone than `L.index`.
my_dict is a dictionary while items() and friends return an iterable. Your proposed change would lead to ambiguity and confusion- what determines how we should iterate (what does it return, when is it done)? How would we define custom iterables?
Can anyone else not read the last lines of some of the slides?
Here is a version with no problems (I used the Box View API): https://view-api.box.com/view/VoRxuIIQel26CLNAgt8KskrQxgUpwD...
Dropbox's powerpoint viewer isn't perfect. I had to download the pptx. Works fine in Keynote.
Thanks for pointing out it's a Dropbox thing--at first I thought it might be a Chrome problem. Specifically the viewer seems not to use the proper font for this presentation. Preview on Mac OS works fine.
What was worse for me was the font kerning on some things (supposed to be fixed width?) was amazingly atrocious.
I'm kinda impressed though. I don't think I have a program that will view pptx files, so I was happy to be able to read it online.
It's some PPTX -> PDF converter and then it's just pdf.js.
Ubuntu here, tried with LibreOffice, same issue. Had to reupload it to google drive, problem solved: https://drive.google.com/file/d/0B8MSXu_W6_e4ZE02ZUQ5dU03cXM...
I am also having this problem.
"People positively hate incompatible changes – especially bad for dynamic languages", "Never again this way – the future is static analysis and annotations".
Wouldn't it be better to pick a better-suited language then?
John Carmack put it nicely:
"One of the lessons that we took away from Doom 3 was that script interpreters are bad, from a performance, debugging, development standpoint. It’s kind of that argument “oh but you want a free-form dynamically typed language here so you can do all of your quick, flexible stuff, and people that aren’t really programmers can do this stuff”, but you know one of the big lessons of a big project is you don’t want people that aren’t really programmers programming, you’ll suffer for it!"
Epic might disagree, the UnrealEngine is heavily scriptable. This was one of its major selling points in the last generation.
I have huge respect for Carmack, but some other people have proved him wrong in the past. His opinions are often taken as gospel, but more discreet people (like Sweeney) may have different and just as worthy points of view.
Why does Guido think that slices syntax is screwed up? I mean, it's not exactly natural, but at least it's consistent (first bound is included, second is excluded):
a = '12345'
a[0:-1] == '1234'
a[-1:0:-1] == '5432'
Personally, I think that "downcounting" slices are rarely used. For code clarity, I prefer reversing the string/list first.

This came up on python-ideas recently, there was a long thread: https://mail.python.org/pipermail/python-ideas/2013-October/...
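For reference, a small demo of the half-open convention under discussion (illustrative values only):

```python
a = '12345'

# Half-open convention: first bound included, second excluded.
assert a[0:-1] == '1234'

# Downcounting slices keep the same convention, which is why the
# first character never shows up -- index 0 is the excluded bound:
assert a[-1:0:-1] == '5432'

# Reversing the whole string needs the stop bound omitted:
assert a[::-1] == '54321'
```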
I would have been more interested in learning Python if there wasn't such a great divide. I read the first chapter of several books that said "Python 3 is out, but we're going to stick with 2.7 because too much shit is broken".
That was true early on in Python 3, but it's not true now.
People have been saying this for three years, but it's not true now.
As someone who learned Python when 3.2 came out, I completely agree with you. I have only really used Python 2.7!
Because too much shit is broken (NumPy, hello). Because Python 3 has been the default on basically no system ever (OK, maybe this is changing right now, slowly).
As Guido says, it's been five years and it will take another five. This whole experiment has been a huge misstep for Python, an absolutely massive gaffe. Some of Python's peers did it too, roughly around the same time (Perl, and to a lesser extent Ruby).
Python (Guido?) noticed its own maturity a bit too late. The damage is incredible; along with the performance stuff (which is in a way easier to overcome) this may be a key factor leading to the fall of a great language.
On the other hand, my experience has been very different: I learned Python when 3.2 was current as well, using Lutz' "Learning Python", which takes the approach of "teach Python 3, and explain how 2 is different whenever necessary". I've followed suit and taken the approach of writing Python 3 code first, and to make it work on 2.7 only when I need to, which I found fairly easy to do, though it can make the code a bit uglier sadly (writing cross-version-compatible metaclass code is the one that annoys me, since it adds some verbosity).
I'm looking forward to 2.x dying out to eliminate that retrofitting step (and it's happening: the improving dependency landscape means I find I have to do it less and less often), but I've not experienced any major pain overall. From where I'm sitting, Python 3 is a better, cleaner language, and as someone new to Python, I'm happier for it.
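The metaclass verbosity mentioned above comes from Python 2 and 3 having mutually unparseable metaclass syntax; a common workaround (the pattern six.with_metaclass packages up) is to call the metaclass directly, since a plain call is valid in both. A sketch, with Meta and Widget as made-up names:

```python
# Python 2 spelling: class Foo(object): __metaclass__ = Meta
# Python 3 spelling: class Foo(metaclass=Meta): ...
# Neither parses on the other version, so cross-version code creates
# an intermediate base by calling the metaclass as a function:

class Meta(type):
    def __new__(mcls, name, bases, ns):
        # Example behaviour: give every class a default 'tag' attribute.
        ns.setdefault('tag', name.lower())
        return type.__new__(mcls, name, bases, ns)

# A plain call, so it works under both interpreters:
Base = Meta('Base', (object,), {})

class Widget(Base):
    pass

print(Widget.tag)  # widget
```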
My story is similar, and I learned Python 3.2 for Numpy.
NumPy works fine on python 3.
Are those books, books that came out this year?
Question for the professional Python developers: do you (on a daily/weekly/monthly) basis switch between projects in Python 2 and Python 3? Is that hard to do (e.g. do you have to constantly and consciously remind yourself of syntax/semantic differences), or does your mind sort of automatically adjust to the new/old patterns?
This is my problem with Python: "Rename func_name —> __name__, etc Rename .next() —> .__next__()"
Too many ugly renames, too few alternatives of doing things. To be honest the only attractive thing to me is all the libraries that they support but I don't find the language itself interesting.
What do you mean by "ugly renames?" How many times in using Python did you have to look at .__next__() or .func_name ?
I have been using Python full time for the last 7 years and I very rarely have to call either of those functions.
> Too few alternatives of doing things.
Can you explain that as well? What do you mean by alternatives of doing things? Like say you want to read a file and you might want to use a wider variety of options when opening the file handle or say you want to parse JSON and you'd like standard library to have more parsers available?
Not having alternatives is python's greatest strength. The language is easy to read because as much as possible there is only one way to write a given concept.
> too few alternatives of doing things
That's what you want for maintainability. I'm not interested in maintaining a codebase where every programmer has their own idea of how something should be done. Of course, you have code reviews for this kind of thing. Except when the code is already written. And when it is not, arguing over minor points is an unnecessary timesink.
__ is basically a namespace for official language extensions. How would you suggest they do it? Prevent "next()" from being a valid method name?
That has been addressed in so many other ways by several languages, from the C++ way, where you actually have namespaces, to the C way, where you don't worry about it and pick another name. Of all of them I find this the oddest way to address it, especially since Python was supposed to improve legibility by design (at least for me those underscores are very distracting).
The underscores are distracting on purpose; any method surrounded by double underscores is one that you're virtually never supposed to explicitly call (there's a builtin `next(foo)` in Python 3, for instance)
Er... they are methods, not functions. Tell me, how does the C++ way namespace methods within an object?
There's no need to make it not be valid. C++ uses begin() and end() for obtaining iterators to containers, but nothing's stopping you from using those method names for your own purposes.
It's just that if you want to use a few new language niceties like range-based for loops then you'll need to conform to that convention.
In C++ it's fairly common to be calling begin() and end() on containers, where in Python it's not common to call next(), you let the for loop handle it. It's reasonable to rename a function to something ugly when you're never going to be seeing it.
That's the best time, so that people don't think it's a function they should be calling without understanding exactly what they're doing.
> where in Python it's not common to call next()
In fact it should never happen, that's what the `iter` and `next` builtins are for. The only use case for calling __next__ by hand is iffy as hell, it's overloading it while inheriting from an iterator.
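A quick sketch of those builtins wrapping the dunder protocol, so user code never spells __iter__ / __next__ directly:

```python
numbers = [10, 20, 30]

it = iter(numbers)        # calls numbers.__iter__()
print(next(it))           # calls it.__next__() -> 10
print(next(it))           # -> 20

# next() also accepts a default instead of raising StopIteration:
empty = iter([])
print(next(empty, 'done'))  # -> done
```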
whoa did he just say static analysis is the future?
yeah, I'd love to understand the reason behind that statement.
PyPy?
pypy is dynamic analysis.
Semantics. Pypy's RPython dialect is "a restricted subset of Python that is amenable to static analysis", to quote the PyPy website.
RPython and Pypy are different things.
RPython is a restricted subset of python indeed, but its purpose is to be a toolkit for implementing virtual machines. It is not and does not aim to be a general-purpose programming environment.
PyPy is a JITed Python runtime implemented in RPython.
The relation between PyPy and RPython is more or less the relation between CPython and C.
That line made me feel warm and fuzzy inside. He also mentioned mypy, which I thought was a lonesome one-man, soon-to-be-abandoned (but absolutely fantastic) project.
Could someone create/convert a PDF version? I don't think my computer is good enough to open the PPT* format
Edit: not really needed, just loaded the dropbox preview, it is still readable
Angry noscript user here. Visit URL... almost blank page... with non-working download button. Enable Javascript... Get .pdf named .pptx.
> Angry noscript user
Redundant, I think.
As a noscript user, I think you're probably right.
Does Python 3 fix the import system? Can I import a file from any location in the file system?
You always could; see __import__ and the imp module.
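For illustration, a sketch of loading a module from an arbitrary path with Python 3's importlib (the __import__/imp pointers above are the older spelling); the file name and greet function here are made up:

```python
import importlib.util
import os
import tempfile

# Write a throwaway module to an arbitrary filesystem location,
# so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "mymod.py")
with open(path, "w") as f:
    f.write("def greet():\n    return 'hello'\n")

# Load it by path (Python 3.5+ API; Python 2 used imp.load_source).
spec = importlib.util.spec_from_file_location("mymod", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
print(mod.greet())  # hello
```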
Now that I get to use Go, I have no desire to go back to python. I think I'm not alone, and that python will soon enough become the new perl.
Suggesting that python will soon enough become the new perl is high praise indeed. Perl runs a good portion of the internet, and is (and will likely remain for the next half century) a very, very popular language.
It's definitely the case that python was shoehorned into some places where it probably wasn't the best fit, but, at the time (1999-2001) was really the high-level dynamic language that had a lot of mindshare. A lot (almost all?) of companies that tried to use Zope as their application server back then would probably be looking at Java as their deployment platform today.
Python sits in that nice "batteries included, easy to read, reasonably fast to write" space. I tend to write most of my scripts in Python, because a week later, I could never understand the perl code I wrote, but, for some reason, python code never had that problem for me.
I don't expect too many people are writing quick "one-off" scripts in Go (I'd be interested to be proven wrong though), so perl/python/awk/bash all still have a place in people's toolkit.
The Python ecosystem is far too diversified to get knocked down by one language. Python has a wealth of production-quality libraries across a ton of domains (Web, scientific computing, data science, NLP, parsing, scripting/automating, etc). Go is a non-entity in most of these domains and isn't even a top-20 programming language on Github (source: http://sogrady-media.redmonk.com/sogrady/files/2013/07/progr...)
Python went everywhere Perl did and then expanded the map for "scripting languages". This didn't happen by accident: Python is, by design, very easy to comprehend and learn. The Python community is also one of the most newbie-friendly around, with mountains of freely available resources for beginners.
A programming language cannot be sustained by uber hackers, PLT nerds, and hipsters alone. You gotta make it reach the world (like JavaScript and Java) or it'll never be a "Tier 1" programming language. I have yet to see Go's developers or community put forth a strategy to make this happen -- which is completely understandable given how new the language is.
The scientific Python community will never leave 2.5 -> 2.7, though...
The science community will get there. The problem is that for a long time, NumPy/SciPy/etc. didn't support 3.x. And when you're more interested in end results, why would you rewrite your code (or spend days/weeks/months relearning) when you could use a still perfectly acceptable and supported version?
There's also a lot of reuse and expansion of existing code bases, which would involve a lot of work to migrate to 3.x. There's also the matter that, on top of moving to 3.x, you have the task of making sure there are no hidden bugs that may alter your results in ways you may not notice. A lot of scientific code has been repeatedly vetted to make sure that there's no bias or glitch that may skew your results.
Hell, I know astronomers who are still using Fortran code that was written in the 80's. It still works (though now it requires a long build process, as it is no longer compatible with the latest Fortran compilers), so no reason to try to rewrite it just because the language is dated.
> The science community will get there. The problem is that for a long time, NumPy/SciPy/etc. didn't support 3.x.
They were some of the earliest widely-used packages to be ported. NumPy was ported in 2010, SciPy very early in 2011 (except for weave, IIRC).
Half the core scientific packages are already P3-compatible. Numpy has supported P3 since 2010[0] and scipy since early 2011[0][1]
[0] http://sourceforge.net/projects/numpy/files//NumPy/1.5.0/NOT... https://pypi.python.org/pypi/numpy/1.5.0
[1] http://docs.scipy.org/doc/scipy-0.9.0/reference/release.0.9....
I'm just going by the (very limited sample of) people I've talked to.
Yeah that's true. Getting everyone to 3 is going to be quite a slog. I have a feeling once Python 3 becomes the default Python installation for OSX and Linux systems you'll see a big uptick in adoption (and probably some abandonment as well).
NLP will definitely be on Python 3 sooner rather than later. Unicode strings are a real killer feature for us.
I'm hearing more rumblings about using 3.x now. Certainly people in this community aren't racing to switch, but I wouldn't say it will never happen.
Python should not directly compete with go. A sure sign of an inexperienced dev is "why I moved from Lang x to Lang y". You should use and know multiple languages, and know where they fit in. That said, for a lot of cases I think Go steps in where python comes up short, which is definitely performance and concurrency. But if I'm doing something that's too complicated to fit easily in a shell script, but doesn't warrant a lot of time, the brevity of python wins every time.
Python was never the right answer for the cases Go solves (OCaml was a better choice 5-10 years ago, and nowadays there are plenty of options). But it's still the best language around for where you just need to write a one-off script as quickly as possible (in that sense it's always been the new perl).
For a one-off script I still prefer Perl. But where I run into Python more is scientific computing. It seems to be making inroads into areas that previously would've only used Matlab. Haven't seen any Go in that area yet. Of the new entrants, Julia seems to be building buzz.
e.g. Sage: http://sagemath.org/
empty dict: {:} empty set: {}
Ah, the road not taken.
It seems like I've been reading about the difficulties of Python V2 -> V3 for a while. Why is that? Is this Python upgrade unusually difficult/ambitious? Or is the Python community just very reluctant to jump on new things?
For the casual user, Python 3 is in many ways backward incompatible; for those types of users (who knew Python 2, were happy with it, and whose full-time job was something else), learning a new version of the language, particularly one without all the supporting libraries of Python 2, didn't really have much return on investment.
Imagine someone who was used to doing something like this:
    import string
    list = open('foo.txt').readlines()
    list_strip = map(string.strip, list)

And then discovering this no longer works because the strip function has been removed from the string module (and moved to the string object). If all they want to do is a quick read of a file and then a parse, they are not going to dive into Python 3 and figure out the new way to do it; they'll just stick with Python 2.

It's a big, ambitious update. The biggest difference is forcing users to distinguish between strings and byte sequences; essentially programs now have to be encoding-aware (at least if they use any of the standard library functions). Which is a Good Thing, but can require a ton of work for existing codebases.
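For comparison, the Python 3 spelling of that snippet uses str methods instead of the removed string-module functions (foo.txt is recreated here so the sketch runs standalone):

```python
# (Write a sample file first so the sketch is self-contained.)
with open('foo.txt', 'w') as f:
    f.write('  alpha  \n  beta  \n')

# The Python 3 equivalent of map(string.strip, readlines()):
with open('foo.txt') as f:
    stripped = [line.strip() for line in f]

print(stripped)  # ['alpha', 'beta']
```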
It's not that ambitious- none of the changes are particularly compelling, none of them scream "update now".
It does, on the other hand, break backwards compatibility. Which is why hardly anyone updated.
Maybe not in the world of ASCII, but the new Unicode system screams pretty loudly to me.
When I decided to use pelican for a non-English blog, I thought it would be a piece of cake; just changing the theme and plugging in a calendar converter and I would be done with it. In reality, I had to fork pelican and the calendar library (which was not well maintained) and bang my head against the wall for three days to make them work together, all because of the whole string/unicode separation and the fact that things only work automagically as long as you're just using ASCII.
Does this get easier or harder in python 3?
I like the explicit separation that Racket has between "here is a buffer of binary data" and "here is a sequence of Unicode characters," and (looking on the outside without working with it), I'm glad that Python 3 began to adopt some of that.
smnrchrds's case of fixing someone else's ASCII assumption gets easier, because the code probably would not have been written that way.
In Python 2, it's really easy to write code that confuses bytes and characters, which introduces bugs and crashes when non-ASCII characters show up.
In Python 3, they made it easier to work with Unicode, because it's the default for everything, and much harder to confuse bytes and characters, because of that separation between the data types.
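A minimal sketch of that separation failing loudly instead of silently:

```python
s = "naïve"              # str: a sequence of Unicode characters
b = s.encode("utf-8")    # bytes: the encoded byte sequence

# In Python 3, mixing the two is a TypeError rather than silent mojibake:
try:
    s + b
except TypeError as err:
    print("refused:", err)
```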
Ding ding ding!! It was not ambitious enough to justify breaking backward compatibility.
I've seen many programmers have trouble with this. But it's essential when using UTF-8, because a string's character length might be different from its byte length. So byte != char != int (0-255). It's hard to grasp for some coders who are used to all of those datatypes being the same.
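The length mismatch is easy to demonstrate:

```python
s = "café"
print(len(s))                  # 4 characters
print(len(s.encode("utf-8")))  # 5 bytes: 'é' encodes to two bytes in UTF-8
```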
Network effects.
When Python 3 started out, it had almost no libraries. So most projects at the top of the software stack couldn't use Python 3 because all the libraries they needed only worked with Python 2.
Below the top, to this day any individual library that wants to switch to 3 basically has to maintain separate forks for 2 and 3, because a lot of downstream users still use 2 because not all libraries are 3-compatible yet.
I think the Python developers are crazy for not using the proven __future__ import mechanism to allow new features to be introduced gradually and have new code interoperating with old code.
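The mechanism in question already existed; a module opting in per-feature looks like this, and runs identically under 2.7 and 3:

```python
from __future__ import print_function, division, unicode_literals

# Under Python 2.7 these imports switch on the Python 3 behaviours
# for this module only; under Python 3 they are harmless no-ops.
print(3 / 2)  # 1.5 on both interpreters
```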
In Ruby world, some people are still on 1.8.7, even though it was end-of-life'd in June, and it was known that it would be on that date a few years before.
Change is hard. People don't want to spend the money on upgrading. It's almost all downside with very little upside.
Around January this year, my employer was still getting requests for Java 1.4!
Don't ever upgrade a running system until you are forced to do so.
IMHO: mostly the problem is splitting the bytes/unicode types that were lumped together in Python 2
http://lucumr.pocoo.org/2013/5/21/porting-to-python-3-redux/
> Is this Python upgrade unusually difficult/ambitious?
Yes.
> Or is the Python community just very reluctant to jump on new things?
In my experience no. Lots of features have been added over the years, and they seem to be adopted quickly. v3 is a really big change though.
As a dev that hasn't been able to move to Py3 yet (but will soon), I'm wishing they'd fix the rest of the issues, bundle PyPy, and ship Python 4 instead! Make it a compelling upgrade.
Then Py3 could be nicknamed "Vista," I suppose.