Showing posts with label Python. Show all posts

Monday, 6 October 2025

Setting up a Python learning environment: Docker, pytest, and ruff

G'day:

I'm learning Python. Not because I particularly want to, but because my 14-year-old son Zachary has IT homework and I should probably be able to help him with it. I've been a web developer for decades, but Python's never been part of my stack. Time to fix that gap.

This article covers getting a Python learning environment set up from scratch: Docker container with modern tooling, pytest for testing, and ruff for code quality. The goal is to have a proper development environment where I can write code, run tests, and not have things break in stupid ways. Nothing revolutionary here, but documenting it for when I inevitably forget how Python dependency management works six months from now.

The repo's at github.com/adamcameron/learning-python (tag 3.0.2), and I'm tracking this as Jira tickets because that's how my brain works. LP-1 was the Docker setup, LP-2 was the testing and linting toolchain.

Getting Docker sorted

First job was getting a Python container running. I'm not installing Python directly on my Windows machine - everything goes in Docker. This keeps the host clean and makes it easy to blow away and rebuild when something inevitably goes wrong.

I went with uv for dependency management. It's the modern Python tooling that consolidates what used to be pip, virtualenv, and a bunch of other stuff into one fast binary. It's written in Rust, so it's actually quick, and it handles the virtual environment isolation properly.

The docker-compose.yml is straightforward:

services:
    python:
        build:
            context: ..
            dockerfile: docker/python/Dockerfile

        volumes:
            - ..:/usr/src/app
            - venv:/usr/src/app/.venv

        stdin_open: true
        tty: true

volumes:
    venv:

The key bit here is that separate volume for .venv. Without it, you get the same problem as with Node.js - the host's virtual environment conflicts with the container's. Using a named volume keeps the container's dependencies isolated while still letting me edit source files on the host.

The Dockerfile handles the initial setup:

FROM astral/uv:python3.11-bookworm

RUN echo "alias ll='ls -alF'" >> ~/.bashrc
RUN echo "alias cls='clear; printf \"\033[3J\"'" >> ~/.bashrc

RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "vim"]

WORKDIR  /usr/src/app

ENV UV_LINK_MODE=copy

RUN \
    --mount=type=cache,target=/root/.cache/uv \
    --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
    uv sync \
    --no-install-project

ENTRYPOINT ["bash"]

Nothing fancy. The astral/uv base image already has Python and uv installed. I'm using Python 3.11 because it's stable and well-supported. The uv sync at build time installs dependencies from pyproject.toml, and that cache mount makes rebuilds faster.

The ENTRYPOINT ["bash"] keeps the container running so I can exec into it and run commands. I'm used to having PHP-FPM containers that stay up with their own service loop, and this achieves the same thing.

One thing I'm doing here differently from usual is that I am using mount to temporarily expose files to the Docker build process. In the past I would have copied pyproject.toml into the image file system. Why the change? Cos I didn't realise I could do this until I saw it in this article I googled up: "Using uv in Docker › Intermediate layers"! I'm gonna use this strategy from now on, I think…

Project configuration and initial code

Python projects use pyproject.toml for configuration - it's the equivalent of package.json in Node.js or composer.json in PHP. Here's the initial setup:

[project]
name = "learning-python"
version = "0.1"
description = "And now I need to learn Python..."
readme = "README.md"
requires-python = ">=3.11"
dependencies = []

[project.scripts]
howdy = "learningpython.lp2.main:greet"

[build-system]
requires = ["uv_build>=0.8.15,<0.9.0"]
build-backend = "uv_build"

[tool.uv.build-backend]
namespace = true

The project.scripts section defines a howdy command that calls the greet function from learningpython.lp2.main. The syntax is module.path:function. This makes the function callable via uv run howdy from the command line.
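To make the module.path:function syntax a bit more concrete, here's a rough sketch of how a string like that can be resolved to a callable. It's an illustration only, not what uv actually generates: load_entry_point is a helper name I've made up, and it targets os.path:join from the standard library so it runs anywhere without the project being installed.

```python
import importlib


def load_entry_point(spec):
    """Resolve a "module.path:function" string to the object it names.

    Hypothetical helper for illustration; not part of uv or the project.
    """
    module_path, _, attribute = spec.partition(":")
    module = importlib.import_module(module_path)  # import the bit before the colon
    return getattr(module, attribute)  # look up the bit after the colon


# Standard-library target used as a stand-in for learningpython.lp2.main:greet
join = load_entry_point("os.path:join")
print(join("src", "learningpython"))
```

The part before the colon has to be importable, and the part after it has to be an attribute of that module.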

The namespace = true bit tells uv to use namespace packages, which means I don't need __init__.py files everywhere. Modern Python packaging is less fussy than it used to be.

The actual code in src/learningpython/lp2/main.py is about as simple as it gets:

def greet():
    print("Hello from learning-python!")

if __name__ == "__main__":
    greet()

Nothing to it. The if __name__ == "__main__" bit means the function runs when you execute the file directly, but not when you import it as a module. Standard Python pattern.

With all this in place, I could build and run the container:

$ docker compose -f docker/docker-compose.yml up --detach
[+] Running 2/2
 ✔ Volume "learning-python_venv"  Created
 ✔ Container learning-python-python-1  Started

$ docker exec learning-python-python-1 uv run howdy
Hello from learning-python!

Right. Basic container works, simple function prints output. Time to sort out testing.

Installing pytest and development dependencies

Python separates runtime dependencies from development dependencies. Runtime deps go in the dependencies array, dev deps go in [dependency-groups]. Things like test frameworks and linters are dev dependencies - you need them for development but not for running the actual application.

To add pytest, I used uv add --dev pytest. This is the Python equivalent of composer require --dev in PHP or npm install --save-dev in Node. The --dev flag tells uv to put it in the dev dependency group rather than treating it as a runtime requirement.

I wanted to pin pytest to major version 8, so I checked PyPI (pypi.org/project/pytest/) to see what was current. As of writing it's 8.4.2. Python uses different version constraint syntax than Composer - instead of ^8.0 you write >=8.0,<9.0. More verbose but explicit.

I also wanted a file watcher like vitest has. There's pytest-watch but it hasn't been maintained since 2020 and doesn't work with modern pyproject.toml files. There's a newer alternative called pytest-watcher that handles the modern Python tooling properly.

After running uv add --dev pytest pytest-watcher, the pyproject.toml updated to include:

[dependency-groups]
dev = [
    "pytest>=8.4.2,<9",
    "pytest-watcher>=0.4.3,<0.5",
]

The uv.lock file pins the exact versions that were installed, giving reproducible builds. It's the Python equivalent of composer.lock or package-lock.json.

Writing the first test

pytest discovers test files automatically. It looks for files named test_*.py or *_test.py and runs functions in them that start with test_. No configuration needed for basic usage.

I created tests/lp2/test_main.py to test the greet() function. The test needed to verify that calling greet() outputs the expected message to stdout. pytest has a built-in fixture called capsys that captures output streams:

from learningpython.lp2.main import greet

def test_greet(capsys):
    greet()
    captured = capsys.readouterr()
    assert captured.out == "Hello from learning-python!\n"

The capsys parameter is a pytest fixture - you just add it as a function parameter and pytest provides it automatically. Calling readouterr() gives you back stdout and stderr as a named tuple. The \n at the end is because Python's print() adds a newline by default.
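For a feel of what capsys is doing, here's a minimal standard-library sketch of the same idea. It is not how pytest implements the fixture: capture_stdout is a made-up helper, and greet is redefined inline so the snippet stands on its own.

```python
import contextlib
import io


def greet():
    # Stand-in for learningpython.lp2.main.greet so the sketch is self-contained
    print("Hello from learning-python!")


def capture_stdout(func):
    """Call func() and return whatever it wrote to stdout (hypothetical helper)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


output = capture_stdout(greet)
assert output == "Hello from learning-python!\n"  # print() appends the newline
```

The fixture is tidier because pytest handles the setup and teardown, and it captures stderr too.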

Running the test:

$ docker exec learning-python-python-1 uv run pytest
======================================= test session starts ========================================
platform linux -- Python 3.11.13, pytest-8.4.2, pluggy-1.6.0
rootdir: /usr/src/app
configfile: pyproject.toml
collected 1 item

tests/lp2/test_main.py .                                                                     [100%]

======================================== 1 passed in 0.01s =========================================

Green. The test found the pyproject.toml config automatically and discovered the test file without needing to tell it where to look.

For continuous testing, pytest-watcher monitors files and re-runs tests on changes:

$ docker exec learning-python-python-1 uv run ptw
[ptw] Watching directories: ['src', 'tests']
[ptw] Running: pytest
======================================= test session starts ========================================
platform linux -- Python 3.11.13, pytest-8.4.2, pluggy-1.6.0
rootdir: /usr/src/app
configfile: pyproject.toml
collected 1 item

tests/lp2/test_main.py .                                                                     [100%]

======================================== 1 passed in 0.01s =========================================

Any time I change a file in src or tests, it automatically re-runs the tests. Much faster feedback loop than running tests manually each time.

Code formatting and linting with ruff

Python has a bunch of tools for code quality - black for formatting, flake8 for linting, isort for import sorting. Or you can just use ruff, which consolidates all of that into one fast tool written in Rust.

Installation was the same pattern: uv add --dev ruff. This added "ruff>=0.8.4,<0.9" to the dev dependencies.

ruff has two main commands:

  • ruff check - linting (finds unused variables, style issues, code problems)
  • ruff format - formatting (fixes indentation, spacing, line length)

Testing it out with some deliberately broken code:

$ docker exec learning-python-python-1 uvx ruff check src/learningpython/lp2/main.py
F841 Local variable `a` is assigned to but never used
 --> src/learningpython/lp2/main.py:2:8
  |
1 | def greet():
2 |        a = "wootywoo"
  |        ^
3 |        print("Hello from learning-python!")
  |
help: Remove assignment to unused variable `a`

It caught the unused variable. It also didn't complain about the 7-space indentation, because ruff check is about code issues, not formatting. That's what ruff format is for:

$ docker exec learning-python-python-1 uvx ruff format src/learningpython/lp2/main.py
1 file reformatted

This fixed the indentation to Python's standard 4 spaces. The check command can also auto-fix some issues with --fix, similar to eslint.

I configured IntelliJ to run ruff format on save. Had to disable a conflicting AMD Adrenalin hotkey first - video driver software stealing IDE shortcuts is always fun to debug. It took about an hour to work out WTF was going on there. I really don't understand why AMD thinks its driver software needs hotkeys. Dorks.

A Python gotcha: hyphens in paths

I reorganised the code by ticket number, so I moved the erstwhile main.py to src/learningpython/lp-2/main.py. Updated the pyproject.toml entry point to match:

[project.scripts]
howdy = "learningpython.lp-2.main:greet"

This did not go well:

$ docker exec learning-python-python-1 uv run howdy
      Built learning-python @ file:///usr/src/app
Uninstalled 1 package in 0.37ms
Installed 1 package in 1ms
  File "/usr/src/app/.venv/bin/howdy", line 4
    from learningpython.lp-2.main import greet
                            ^
SyntaxError: invalid decimal literal

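Python's import system doesn't support hyphens in module names. When it sees lp-2, it tries to parse it as "lp minus 2" rather than as a name, and then chokes on the 2.main that follows (hence the "invalid decimal literal" complaint). Module names need to be valid Python identifiers, which means letters, numbers, and underscores only, and not starting with a number.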

Renaming to lp2 fixed it. No hyphens in directory names if those directories are part of the import path. You can use hyphens in filenames that you access directly (like python path/to/some-script.py), but not in anything you're importing as a module.
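A quick way to check a dotted path before putting it in pyproject.toml is to test each segment with str.isidentifier(). This is just a sketch: is_importable_name is a helper name I've invented, and it only checks the spelling, not whether the module actually exists.

```python
import keyword


def is_importable_name(dotted_path):
    """Return True if every segment could be used in an import statement.

    Hypothetical helper: checks spelling only, not whether the module exists.
    """
    return all(
        part.isidentifier() and not keyword.iskeyword(part)
        for part in dotted_path.split(".")
    )


print(is_importable_name("learningpython.lp2.main"))   # True
print(is_importable_name("learningpython.lp-2.main"))  # False: hyphen
print(is_importable_name("learningpython.2lp.main"))   # False: leading digit
```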

This caught me out because hyphens are fine in most other ecosystems. Coming from PHP and JavaScript where some-module-name is perfectly normal, Python's stricter rules take some adjustment.

Wrapping up

So that's the development environment sorted. Docker container running Python 3.11 with uv for dependency management. pytest for testing with pytest-watcher for continuous test runs. ruff handling both linting and formatting. All the basics for writing Python code without things being annoying.

The final project structure looks like this:

learning-python/
├── docker/
│   ├── docker-compose.yml
│   └── python/
│       └── Dockerfile
├── src/
│   └── learningpython/
│       └── lp2/
│           └── main.py
├── tests/
│   └── lp2/
│       └── test_main.py
├── pyproject.toml
└── uv.lock

Everything's on GitHub at github.com/adamcameron/learning-python (tag 3.0.2).

Now I can actually start learning Python instead of fighting with tooling. Which is the point.

Righto.

--
Adam

Tuesday, 28 November 2017

That array_map quandary implemented in other languages

G'day:
A coupla days ago I bleated about array_map [having] a dumb implementation. I had what I thought was an obvious application for array_map in PHP, but it couldn't really accommodate me due to array_map not exposing the array's keys to the callback, and then messing up the keys in the mapped array if one passes array_map more than one array to process.

I needed to remap this:

[
    "2008-11-08" => "Jacinda",
    "1990-10-27" => "Bill",
    "2014-09-20" => "James",
    "1979-05-24" => "Winston"
]

To this:

array(4) {
  '2008-11-08' =>
  class IndexedPerson#3 (2) {
    public $date =>
    string(10) "2008-11-08"
    public $name =>
    string(7) "Jacinda"
  }
  '1990-10-27' =>
  class IndexedPerson#4 (2) {
    public $date =>
    string(10) "1990-10-27"
    public $name =>
    string(4) "Bill"
  }
  '2014-09-20' =>
  class IndexedPerson#5 (2) {
    public $date =>
    string(10) "2014-09-20"
    public $name =>
    string(5) "James"
  }
  '1979-05-24' =>
  class IndexedPerson#6 (2) {
    public $date =>
    string(10) "1979-05-24"
    public $name =>
    string(7) "Winston"
  }
}

Note how the remapped object also contains the original key value. That was the sticking point. Go read the article for more detail and more whining.

OK so my expectations of PHP's array higher order functions are based on my experience with JS's and CFML's equivalents. Both of which receive the key as well as the value in all callbacks. I decided to see how other languages achieve the same end, and I'll pop the code in here for shits 'n' giggles.


CFML

Given most of my history is as a CFML dev, that one was easy.

peopleData = ["2008-11-08" = "Jacinda", "1990-10-27" = "Bill", "2014-09-20" = "James", "1979-05-24" = "Winston"]

people = peopleData.map((date, name) => new IndexedPerson(date, name))

people.each((date, person) => echo("#date# => #person#<br>"))

Oh, this presupposes the IndexedPerson component. Due to a shortcoming of how CFML works, components must be declared in a file of their own:

component {

    function init(date, name) {
        this.date = date
        this.name = name
    }

    string function _toString() {
        return "{date:#this.date#; name: #this.name#}"
    }
}


But the key bit is the mapping operation:

people = peopleData.map((date, name) => new IndexedPerson(date, name))

Couldn't be simpler (NB: this is Lucee's CFML implementation, not ColdFusion's which does not yet support arrow functions).

The output is:


2008-11-08 => {date:2008-11-08; name: Jacinda}
1990-10-27 => {date:1990-10-27; name: Bill}
2014-09-20 => {date:2014-09-20; name: James}
1979-05-24 => {date:1979-05-24; name: Winston}

Also note that CFML doesn't have associative arrays, it has structs, so the keys are not ordered. This does not matter here. (Thanks to Zac for correcting me here: CFML does have ordered structs these days).


JS

The next language I turned to was JS as that's the one I'm next most familiar with. One thing that hadn't occurred to me is that whilst JS's Array implementation has a map method, we need to use an object here as the keys are values not indexes. And whilst I knew Objects didn't have a map method, I didn't know what the equivalent might be.

Well it turns out that there's no real option to use a map here, so I needed to do a reduce on the object's entries. Still: it's pretty terse and obvious:

class IndexedPerson {
    constructor(date, name) {
        this.date = date
        this.name = name
    }
}

let peopleData = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

let people = Object.entries(peopleData).reduce(function (people, personData) {
    people.set(personData[0], new IndexedPerson(personData[0], personData[1]))
    return people
}, new Map())

console.log(people)

This returns what we want:

Map {
  '2008-11-08' => IndexedPerson { date: '2008-11-08', name: 'Jacinda' },
  '1990-10-27' => IndexedPerson { date: '1990-10-27', name: 'Bill' },
  '2014-09-20' => IndexedPerson { date: '2014-09-20', name: 'James' },
  '1979-05-24' => IndexedPerson { date: '1979-05-24', name: 'Winston' } }

TBH I think this is a misuse of an object to contain basically an associative array / struct, but so be it. It's the closest analogy to the PHP requirement. I was able to at least return it as a Map, which I think is better. I tried to have the incoming personData as a map, but the Map prototype's equivalent of entries() used above is unhelpful in that it returns an Iterator, and the prototype for Iterator is a bit spartan.

I think it's slightly clumsy I need to access the entries value via array notation instead of some sort of name, but this is minor.

As with all my code, I welcome people showing me how I should actually be doing this. Post a comment. I'm looking at you Ryan Guill ;-)

Java

Next up was Java. Holy fuck what a morass of boilerplate nonsense I needed to perform this simple operation in Java. Deep breath...

import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class IndexedPerson {
    String date;
    String name;
    
    public IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }
    
    public String toString(){
        return String.format("{date: %s, name: %s}", this.date, this.name);
    }
}

class Collect {

    public static void main(String[] args) {

        HashMap<String,String> peopleData = loadData();

        HashMap<String, IndexedPerson> people = mapToPeople(peopleData);
            
        dumpIdents(people);
    }
    
    private static HashMap<String,String> loadData(){
        HashMap<String,String> peopleData = new HashMap<String,String>();
        
        peopleData.put("2008-11-08", "Jacinda");
        peopleData.put("1990-10-27", "Bill");
        peopleData.put("2014-09-20", "James");
        peopleData.put("1979-05-24", "Winston");
        
        return peopleData;
    }
    
    private static HashMap<String,IndexedPerson> mapToPeople(HashMap<String,String> peopleData) {
        HashMap<String, IndexedPerson> people = (HashMap<String, IndexedPerson>) peopleData.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> new IndexedPerson(e.getKey(), e.getValue())
            ));
            
        return people;
    }
    
    private static void dumpIdents(HashMap<String,IndexedPerson> people) {
        for (Map.Entry<String, IndexedPerson> entry : people.entrySet()) {
            System.out.println(String.format("%s => %s", entry.getKey(), entry.getValue()));
        }
    }
    
}

Result:
1979-05-24 => {date: 1979-05-24, name: Winston}
2014-09-20 => {date: 2014-09-20, name: James}
1990-10-27 => {date: 1990-10-27, name: Bill}
2008-11-08 => {date: 2008-11-08, name: Jacinda}

Most of that lot seems to be just messing around telling Java what type everything is. Bleah.

The interesting bit - my grasp of which is tenuous - is the Collectors.toMap. I have to admit I derived that from reading various Stack Overflow articles. But I got it working, and I know the general approach now, so that's good.

Too much code for such a simple thing though, eh?


Groovy

Groovy is my antidote to Java. Groovy makes this shit easy:

class IndexedPerson {
    String date
    String name

    IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }

    String toString(){
        String.format("date: %s, name: %s", this.date, this.name)
    }
}

peopleData = ["2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"]

people = peopleData.collectEntries {date, name -> [date, new IndexedPerson(date, name)]}

people.each {date, person -> println String.format("%s => {%s}", date, person)}

Bear in mind that most of that is getting the class defined, and the output. The bit that does the mapping is just the one line in the middle. That's more like it.

Again, I don't know much about Groovy… I had to RTFM to find out how to do the collectEntries bit, but it was easy to find and easy to understand.

I really wish I had a job doing Groovy.

Oh yeah, for the sake of completeness, the output was thus:

2008-11-08 => {date: 2008-11-08, name: Jacinda}
1990-10-27 => {date: 1990-10-27, name: Bill}
2014-09-20 => {date: 2014-09-20, name: James}
1979-05-24 => {date: 1979-05-24, name: Winston}


Ruby

Ruby's version was pretty simple too as it turns out. No surprise there as Ruby's all about higher order functions and applying blocks to collections and stuff like that.

class IndexedPerson

    def initialize(date, name)
        @date = date
        @name = name
    end

    def inspect
        "{date:#{@date}; name: #{@name}}\n"
    end
end

peopleData = {"2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"}

people = peopleData.merge(peopleData) do |date, name|
    IndexedPerson.new(date, name)
end

puts people

Predictable output:

{"2008-11-08"=>{date:2008-11-08; name: Jacinda}
, "1990-10-27"=>{date:1990-10-27; name: Bill}
, "2014-09-20"=>{date:2014-09-20; name: James}
, "1979-05-24"=>{date:1979-05-24; name: Winston}
}

I wasn't too sure about all that block nonsense when I first started looking at Ruby, but I quite like it now. It's easy to read.


Python

My Python skills don't extend much beyond printing G'day World on the screen, but it was surprisingly easy to google-up how to do this. And I finally got to see what Python folk are on about with this "comprehensions" stuff, which I think is quite cool.

class IndexedPerson:
    def __init__(self, date, name):
        self.date = date
        self.name = name

    def __repr__(self):
        return "{{date: {date}, name: {name}}}".format(date=self.date, name=self.name)

people_data = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

people = {date: IndexedPerson(date, name) for (date, name) in people_data.items()}

print("\n".join(['%s => %s' % (date, person) for (date, person) in people.items()]))


And now that I am all about Clean Code, I kinda get the "whitespace as indentation" thing too. It's clear enough if yer code is clean in the first place.

The output of this is identical to the Groovy one.

Only one more then I'll stop.

Clojure

I can only barely do G'day World in Clojure, so this took me a while to work out. I also find the Clojure docs to be pretty impenetrable. I'm sure they're great if one already knows what one is doing, but I found them pretty inaccessible from the perspective of a n00b. It's like if the PHP docs were solely the user-added stuff at the bottom of each docs page. Most blog articles I saw about Clojure were pretty much just direct regurgitation of the docs, without much value-add, if I'm to be honest.

(defrecord IndexedPerson [date name])

(def people-data (array-map "2008-11-08" "Jacinda" "1990-10-27" "Bill" "2014-09-20" "James" "1979-05-24" "Winston"))

(def people
  (reduce-kv
    (fn [people date name] (conj people (array-map date (IndexedPerson. date name))))
    (array-map)
    people-data))

(print people)

The other thing with Clojure for me is that the code is so alien-looking to me that I can't work out how to indent stuff to make the code clearer. All the examples I've seen don't seem very clear, and the indentation doesn't help either, I think. I guess with more practise it would come to me.

It seems pretty powerful though, cos there's not much code there to achieve the desired end-goal.

Output for this one:

{2008-11-08 #user.IndexedPerson{:date 2008-11-08, :name Jacinda},
1990-10-27 #user.IndexedPerson{:date 1990-10-27, :name Bill},
2014-09-20 #user.IndexedPerson{:date 2014-09-20, :name James},
1979-05-24 #user.IndexedPerson{:date 1979-05-24, :name Winston}}


Summary

This was actually a very interesting exercise for me, and I learned stuff about all the languages concerned. Even PHP and CFML.

I twitterised a comment regarding how pleasing I found each solution:


This was before I did the Clojure one, and I'd slot that in after CFML and before JS, making the list:
  1. Python
  2. Ruby
  3. Groovy
  4. CFML
  5. Clojure
  6. JS
  7. PHP
  8. Java

Python's code looks nice and it was easy to find out what to do. Same with Ruby, just not quite so much. And, really same with Groovy. I could order those three any way. I think Python tips the scales slightly with the comprehensions.

CFML came out surprisingly well in this, as it's a bloody easy exercise to achieve with it.

Clojure's fine, just a pain in the arse to understand what's going on, and the code looks a mess to me. But it does a lot in little space.

JS was disappointing because it wasn't nearly so easy as I expected it to be.

PHP is a mess.

And - fuck me - Java. Jesus.

My occasional reader Barry O'Sullivan volunteered some input the other day:


Hopefully he's still up for this, and I'll add it to the list so we can have a look at that code too.

Like I said before, if you know a better or more interesting way to do this in any of the languages above, or any other languages, make a comment and post a link to a Gist (just don't put the code inline in the comment please; it will not render at all well).

I might have another one of these exercises to do soon with another puzzle a friend of mine had to recently endure in a job-interview-related coding test. We'll see.

Righto.

--
Adam

Saturday, 30 January 2016

Appendix to previous article: Python examples of truthy/falsy

G'day:
All code and no narrative, this one. Here's the Python 3 equivalent of the Groovy code from the previous article (CFML (or probably LuceeLang) and what constitutes "The Truth").

print("none")
if None:
    print("truthy")
else:
    print("falsy")
print("=====================")


print("empty string")
if "":
    print("truthy")
else:
    print("falsy")
print("=====================")

print("non-empty string")
if "0":
    print("truthy")
else:
    print("falsy")
print("=====================")

print("zero")
if 0:
    print("truthy")
else:
    print("falsy")
print("=====================")

print("non-zero")
if -1:
    print("truthy")
else:
    print("falsy")
print("=====================")

print("empty list")
if []:
    print("truthy")
else:
    print("falsy")
print("=====================")


print("non-empty list")
if [None]:
    print("truthy")
else:
    print("falsy")
print("=====================")


print("empty dictionary")
if {}:
    print("truthy")
else:
    print("falsy")
print("=====================")

print("non-empty dictionary")
if {"key":None}:
    print("truthy")
else:
    print("falsy")
print("=====================")

class Test:
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        return self.value == "truthy"

print("truthy object")
test = Test("truthy")
if test:
    print("truthy")
else:
    print("falsy")
print("=====================")

print("non-truthy object")
if Test("anything else"):
    print("truthy")
else:
    print("falsy")
print("=====================")

Output:

>python truthy.py
none
falsy
=====================
empty string
falsy
=====================
non-empty string
truthy
=====================
zero
falsy
=====================
non-zero
truthy
=====================
empty list
falsy
=====================
non-empty list
truthy
=====================
empty dictionary
falsy
=====================
non-empty dictionary
truthy
=====================
truthy object
truthy
=====================
non-truthy object
falsy
=====================

>

The most interesting thing in this is __init__ and __bool__? FFS. I've found a language even worse than PHP for its shit function names and underscore usage, it seems.
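For what it's worth, the same checks can be squashed into a table-driven loop. This is only a sketch of an alternative layout, not a replacement for the script above: the samples and results names are mine, and it relies on bool() applying the same truth test that an if statement does (which is where __bool__ gets called).

```python
class Test:
    """Same class as above: truthy only when constructed with "truthy"."""

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        return self.value == "truthy"


# (label, value) pairs mirroring the checks in the script above
samples = [
    ("none", None),
    ("empty string", ""),
    ("non-empty string", "0"),
    ("zero", 0),
    ("non-zero", -1),
    ("empty list", []),
    ("non-empty list", [None]),
    ("empty dictionary", {}),
    ("non-empty dictionary", {"key": None}),
    ("truthy object", Test("truthy")),
    ("non-truthy object", Test("anything else")),
]

# bool(value) runs the same truth test as "if value:"
results = {label: bool(value) for label, value in samples}

for label, is_truthy in results.items():
    print(f"{label}: {'truthy' if is_truthy else 'falsy'}")
```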

More Guinness, pls...

--
Adam

Tuesday, 19 January 2016

Floating point arithmetic with decimals

G'day:
As a human... what is the value of z, after you process this pseudocode with your wetware:

x = 17.76
y = 100
z = x * y

Hopefully you'd say "1776". It was not a trick question.

And that's an integer, right? Correct.

CFML

Now... try this CFML code:

x = 17.76;
y  = 100;
z = x*y;

writeOutput(z);

1776

So far so good.

But what about this:

writeOutput(isValid("integer", z));

You might think "YES" (or true if yer on Lucee), however it's "NO".

And this is where young players fall into the trap. They get all annoyed with isValid() getting it wrong, etc. Which, to be fair, is a reasonable assumption with isValid(), but it's not correct in this instance. It's the young player who is mistaken.

If we now do this:

writeOutput(z.getClass().getName());

We get: java.lang.Double

OK, but 1776 can be a Double, sure. But CFML should still consider a Double 1776 as a valid integer, as it should be able to be treated like one. So why doesn't it? What if we circumvent CFML, and go straight to Java:

writeOutput(z.toString());

1776.0000000000002

Boom. Floating point arithmetic inaccuracy.

Never ever ever forget, everyone... when you multiply floating point numbers with decimals... you will get "unexpected" (but you should pretty much expect it!) floating point accuracy issues. This is for the perennial reason that what's easy for us to express in decimal is actually quite hard for a computer to translate into binary accurately.

Aside: we were chatting about all this on the CFML Slack channel this morning, and one person asked "OK, so how come 17.75 x 100 works and 17.76 x 100 does not?". This is because a computer can represent 0.75 in binary exactly (2^-1 + 2^-2), whereas 0.76 can only be approximated, hence causing the "issue".
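The 0.75 versus 0.76 difference is easy to see for yourself. Here's a small Python sketch (Python being my choice for illustration, it's not part of the Slack conversation) that uses fractions.Fraction to show the exact value a float actually holds:

```python
from fractions import Fraction

# Fraction(float) converts the float's exact binary value, with no rounding
print(Fraction(0.75))                        # 3/4
print(Fraction(0.75) == Fraction(3, 4))      # True: 0.75 is stored exactly
print(Fraction(0.76) == Fraction(76, 100))   # False: 0.76 is only approximated

print(17.75 * 100)  # 1775.0
print(17.76 * 100)  # 1776.0000000000002
```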

The problem really is that CFML should simply output 1776.0000000000002 when we ask it, and it should not try to be clever and hide this stuff. Because it's significant information. Then when the young player output the value, they'd go "oh yeah, better round that" or whatever they need to do before proceeding. CFML is not helping here.

This is pretty ubiquitous in programming. Let's have a trawl through the various languages I can write the simplest of code in:

JavaScript


x = 17.76;
y = 100;
z = x * y

console.log(z);


>node jsVersion.js
1776.0000000000002

>

JS just does what it's told. Unsurprisingly.

Groovy


x = 17.76
y = 100
z = x * y
println "x * y: " + z

println "x: " + x.getClass().getName()
println "y: " + y.getClass().getName()
println "z: " + z.getClass().getName()
println "z: " + z.toString()


>groovy32 groovyVersion.groovy

x * y: 1776.00
x: java.math.BigDecimal
y: java.lang.Integer
z: java.math.BigDecimal
z: 1776.00
>


This is interesting. Groovy treats a decimal literal like 17.76 as a BigDecimal rather than as a binary float, so the arithmetic is done exactly in base 10. The result keeps the total number of decimal places expressed in its factors (BigDecimal multiplication adds the scales of the two operands together). That's how I was taught to do it in Physics at school, so I like this. This second example makes it more clear:

x = 3.30
y = 7.70
z = x * y
println "x * y: " + z

println "x: " + x.getClass().getName()
println "y: " + y.getClass().getName()
println "z: " + z.getClass().getName()
println "z: " + z.toString()


>groovy32 more.groovy
x * y: 25.4100
x: java.math.BigDecimal
y: java.math.BigDecimal
z: java.math.BigDecimal
z: 25.4100
>

In 3.30 and 7.70 there are four decimal places expressed (ie: two for each factor), so Groovy maintains that accuracy. Nice!


Java


import java.math.BigDecimal;

class JavaVersion {

    public static void main(String[] args){
        double x = 17.76;
        int y = 100;
        System.out.println(x*y);
        
        BigDecimal x2 = new BigDecimal(17.76);
        BigDecimal y2 = new BigDecimal(100);
        System.out.println(x2.multiply(y2));
        
    }
}

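Here I added a different variation because I was trying to see why the Groovy code behaved the way it did. I suspected that perhaps it was a BigDecimal thing how it decided on the accuracy of the result, but this test didn't show that: new BigDecimal(17.76) is built from a double, so it faithfully captures the double's binary approximation rather than the decimal 17.76 (passing the string "17.76" to the constructor instead would have kept it exact):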


>java JavaVersion
1776.0000000000002
1776.000000000000156319401867222040891647338867187500

>

This is a good demonstration of how a simple base-10 decimal fraction can be a recurring (non-terminating) fraction in binary, so it can only be stored approximately.
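The same effect can be reproduced with Python's decimal module, which I'm using here as a rough stand-in for BigDecimal (an assumption for illustration: the two are similar in spirit, not identical). Whether the value starts life as a string or as a float makes all the difference:

```python
from decimal import Decimal

# Built from a string: the decimal digits are taken literally
from_string = Decimal("17.76") * Decimal(100)

# Built from a float: the float's binary approximation is carried over
from_float = Decimal(17.76) * Decimal(100)

print(from_string)                   # 1776.00
print(from_string == Decimal(1776))  # True
print(from_float == Decimal(1776))   # False
print(Decimal(17.76))                # the value the float really holds
```

Note that the string-based result also keeps the two decimal places from its factors, like the Groovy output does.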

Monday, 24 November 2014

Weekend code puzzle: my answer (Python version)

G'day:
This is a companion exercise to my earlier articles:
I'll be using exactly the same logic as in those two, just working within Python's constraints. I freely admit to not knowing Python from a bar of soap - indeed this is the third bit of Python I have ever written - so I would not vouch for this being anything other than a comparison to the other two pieces of code, and not a demonstration of what a Python dev might consider "good code". This is not a Python tutorial.

Sunday, 23 November 2014

Weekend code puzzle: Dave's answer (Python)

G'day:
I'm continuing to look at each person's submissions for the code puzzle ("Something for the weekend? A wee code puzzle (in CFML, PHP, anything really...)").

Dave's done a Python version. Like Chris just before him, Dave got his answer in before I varied the rules slightly, so his answer just finds the first longest subseries within the threshold from within the series; it does not check same-lengthed subseries for which has the highest within-threshold total. "Within" three times in a sentence. Sorry about that.

Sunday, 5 October 2014

PHP: include paths are relative to the current working directory

G'day:
This one had me confused for a day or so last week. It seems the relative paths in include / require calls in PHP are relative to the current working directory, not the file the include statement is actually in. I'm not sure I agree with this.