Monday 3 August 2015

PHP: Has PHP7 implemented function return-type checking syntax poorly?

G'day:
I had a look at PHP 7's new (but very late to the party) capability to type-check function return types the other day ("PHP 7: functions can now have type-checking on their return values"). The functionality kinda works OK, but I think the syntax is off:

[access modifier] [static modifier] function functionName([[type] argumentName[=defaultValue]][,...])[ : return type] {
    // implementation
}

EG:


public static function getFullName(string $firstName, string $lastName) : string {
    return "$firstName $lastName";
}

The problem being that whilst all the other function modifiers are before the function name, the return type modifier is right at the end of the function signature.

This in itself is only slightly crap, but when one considers the type modifiers on arguments are before the argument name, it then just starts being a sloppy implementation.
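To make the complaint concrete, here's a minimal sketch of the mixed placement, and of the check actually firing. The NameFormatter class and the getBrokenName method are made up purely for illustration; only getFullName comes from the example above. It assumes PHP 7 or later:

```php
<?php
// Hypothetical class, used only to give the static methods a home.
class NameFormatter
{
    // The access modifier, the static modifier and both argument types sit
    // BEFORE the names they qualify; the return type trails the parameter list.
    public static function getFullName(string $firstName, string $lastName) : string
    {
        return "$firstName $lastName";
    }

    // Deliberately wrong: declared to return a string, but returns an array.
    public static function getBrokenName(string $firstName, string $lastName) : string
    {
        return [$firstName, $lastName];
    }
}

echo NameFormatter::getFullName("Zinedine", "Zidane"), PHP_EOL;

try {
    NameFormatter::getBrokenName("Zinedine", "Zidane");
} catch (TypeError $e) {
    // An array cannot be coerced to a string, so the return-type check fails.
    echo get_class($e), PHP_EOL;
}
```

Run from the command line, this should print the full name and then the class of the exception. Note that the mismatch is only picked up at run time, when the function actually returns, not when the file is parsed.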

However I wondered if there was a precedent in other popular languages I was unaware of in which this is actually something that's done, and I just hadn't noticed. So I decided to check. I've gone through what's considered the ten most popular languages and looked at what their syntax is for argument and return type checking.



Language     | Return type | Argument type | Example                             | Observation
-------------|-------------|---------------|-------------------------------------|------------
Java         | Before      | Before        | int f(int x)                        |
C            | Before      | Before        | int f(int x)                        |
C++          | Before      | Before        | int f(int x)                        |
C#           | Before      | Before        | int f(int x)                        |
Python       | After       | After         | def f(x: int) -> int                | This is only metadata, not a formal type check
Objective-C  | Before      | Before        | - (int)f:(int)x                     |
Visual Basic | After       | After         | Function f(x As Integer) As Integer |
JavaScript   | -           | -             | function f(x)                       | Does not support type-checking
Ruby         | -           | -             | def f(x)                            | Does not support type-checking
ActionScript | After       | After         | function f(x:int):int               |
Groovy       | Before      | Before        | int f(int x)                        |
Clojure      | Before      | Before        | (defn ^Integer f[^Integer x])       |
Go           | After       | After         | func f(x int) int                   |
CFML         | Before      | Before        | numeric function f(numeric x)       |
PHP          | After       | Before        | function f(int $x) : int            | Only surveyed language to mix the two

I included CFML in there not because I'm claiming it's popular, but because it's a very similar language to PHP, and is where my baseline expectations come from.

Right, so PHP is an outlier here. No other language I checked mixed the position of its type modifiers, let alone mixed the position of its other modifiers (eg: access and static). My conclusion therefore is that it's pretty much a bad approach.

(BTW: I'm dead keen to hear about languages that do do this mix-up like PHP does. My stats gathering was not very comprehensive here, I know).

The rationale for doing it the way PHP 7 has is actually in the RFC ("PHP RFC: Return Type Declarations"):

Position of Type Declaration

The two major conventions in other programming languages for placing return type information are:
  • Before the function name
  • After the parameter list's closing parenthesis
The former position has been proposed in the past and the RFCs were either declined or withdrawn. One cited issue is that many developers wanted to preserve the ability to search for function foo to be able to find the definition for foo. A recent discussion about removing the function keyword has several comments that re-emphasized the value in preserving this.

The latter position is used in several languages; notably C++11 also places the return type after the parameter lists for certain constructs such as lambdas and auto-deducing return types.

Declaring the return type after the parameter list had no shift/reduce conflicts in the parser.

I don't believe the PHP language designers have thought this through thoroughly, and their rationalisations are poor.

  • I didn't find too many languages that had both type-checking and required the function keyword, but the type comes before the function keyword, not after. So this is not a valid consideration.
  • C++ might specify the type at the end of the signature for lambdas, but these are not lambdas. Lambdas are expressions. This is a function declaration statement. This is comparing apples to orangutans. And when it comes to function declaration statements, C++ conventionally puts the return type at the front (the C++11 trailing form is an optional alternative, not the norm).

What about these other languages they speak of? They mention:


  • Hack - yep: this does actually have argument types specified before the argument, and the function return type after the function.
  • Haskell - yep. Given the syntax is sooo different to how PHP works, I wonder if this is really relevant? They want to be finding precedents set in "curly-brace" procedural or object-oriented languages, not FP ones which have no real syntactic similarity.
  • Go - yes, but it also specifies the argument type after the argument too, so not really the same as PHP.
  • Erlang - yes, but as with Go.
  • ActionScript - as with Go and Erlang.
  • TypeScript - as with Go, Erlang and ActionScript.

I concede there is a slight precedent set by Hack (but seriously... Hack? Who cares what Hack does?), but I don't think the comparative precedent really stands up to too much logical scrutiny in the other examples cited.

I think PHP has got this one wrong. What's more, their rationale for not putting the return type at the front is logically flawed, and pretty much invalidates the decision to put it at the end.

Hopefully I can get this info in front of some of the decision makers on the PHP project, and get it revisited before PHP 7 goes out the door.

Thoughts?

--
Adam