Phalanger
Phalanger is a new PHP implementation that brings the PHP language into the family of compiled .NET languages. It provides PHP applications with an execution environment that is fast and highly compatible with the vast body of existing PHP code. Phalanger lets web-application developers benefit from both the ease of use and effectiveness of the PHP language and the power and richness of the .NET platform, taking the best of both sides.
Other topics:
- Security - Security advantages of Phalanger-powered PHP applications.
- Phalanger use cases - Important and interesting uses of Phalanger.
PHP Language Compilation
The Phalanger compiler compiles PHP source code into MSIL bytecode. It is thus the front end of the compilation, while the back end is provided by the JITter (Just-In-Time compiler), which is part of the .NET Framework. That is why we address neither native code generation nor its optimization. Our task is to compile PHP scripts into .NET assemblies: logical units containing MSIL code and metadata.
To reuse compiled code as much as possible, individual PHP scripts must be compiled separately. This contrasts with the behavior of the PHP interpreter: when a PHP script references another (for example, using the include construct), the interpreter does something very similar to copying and pasting the included source code at that place. Code sharing therefore involves a lot of reinterpretation unless the code is cached and optimized by an additional third-party tool. That is why the Phalanger compiler compiles individual files into unilaterally independent modules (an included script does not depend on the including one), so that the resulting modules can be linked without recompilation. This approach requires careful design of the compiled form of scripts, because that form determines whether such reuse is possible.
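The separate-compilation idea can be sketched with a small model. The example below is written in Python, not PHP or MSIL, and every name in it (SCRIPTS, load_module, compile_script, render) is hypothetical; it only illustrates that an included script is compiled once and then linked by reference, rather than being re-processed inside every including script.

```python
from typing import Callable, Dict, List, Tuple

Statement = Tuple[str, str]            # ("echo", text) or ("include", script name)
Module = Callable[[List[str]], None]   # a compiled script appends to an output buffer

# Toy "source files" (hypothetical contents).
SCRIPTS: Dict[str, List[Statement]] = {
    "header.php": [("echo", "<h1>Site</h1>")],
    "index.php": [("include", "header.php"), ("echo", "home")],
    "about.php": [("include", "header.php"), ("echo", "about")],
}

compile_counts: Dict[str, int] = {}    # how often each script was compiled
_modules: Dict[str, Module] = {}       # compiled modules, one per script


def load_module(name: str) -> Module:
    """Return the compiled module for a script, compiling it at most once."""
    if name not in _modules:
        _modules[name] = compile_script(name)
    return _modules[name]


def compile_script(name: str) -> Module:
    """Compile one script on its own; includes stay references to other modules."""
    compile_counts[name] = compile_counts.get(name, 0) + 1
    statements = list(SCRIPTS[name])

    def run(out: List[str]) -> None:
        for kind, arg in statements:
            if kind == "echo":
                out.append(arg)
            else:
                # "include": link to the separately compiled module at run time
                # instead of inlining the included source.
                load_module(arg)(out)

    return run


def render(name: str) -> str:
    """Execute a script and return its output."""
    out: List[str] = []
    load_module(name)(out)
    return "".join(out)
```

Rendering both index.php and about.php in this model compiles header.php only once, because the included module does not depend on whichever script includes it.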
The PHP language contains some constructs of an interpreted character. These constructs have to be handled at compile time so that the resulting code behaves identically but runs efficiently. Some constructs can be implemented by emitting parts of the compile-time symbol tables into the resulting code; others require executing code that is unknown at compile time (e.g. the eval construct). The latter require invoking the compiler at run time to generate and run MSIL code (run-time code generation is supported by the Framework thanks to JIT compilation, although there are some drawbacks, which we also discuss in this paper). Note that an experienced programmer rarely needs such constructs, because in most cases the same effect can be achieved with a "cleaner" technique. It is thus not necessary to compile such constructs more efficiently than other ones.
The Class Library
PHP offers several hundred functions to script programmers. They can be divided into the following groups:
- PHP language functions (constructs) working directly with variables, functions, objects etc. (e.g. eval, assert, include, list etc.)
- Built-in functions for string and array manipulations, mathematical functions, regular expressions, file access etc.
- External functions contained in PHP extensions used for example for database access, image manipulation etc.
Language constructs are implemented by the compiler, including those that have function-call syntax but treat their arguments in a special manner (such as array, list etc.). There are no corresponding functions in the Phalanger Class Library. Built-in functions are implemented in the Phalanger Class Library, which is written entirely in C# and provides this functionality via classes (for example PHP.PhpStrings, PHP.PhpArrays etc.) and their static methods. The library is designed to offer its functionality not only to compiled PHP code but also to encourage reuse in any other .NET Framework application (written in C#, J#, VB.NET etc.). Moreover, by respecting some simple rules, a programmer can extend the Class Library with their own functions written in an arbitrary .NET language.
External functions are implemented in PHP as dynamically linked libraries (.dll on the Microsoft Windows platform). These libraries are loaded into the PHP interpreter's address space and communicate with PHP through a predefined set of functions (called the Zend API). A typical PHP installation contains about 50 such libraries, and the functions they provide are numerous. Any of these libraries could be implemented in C# and added to the Class Library (as stated above), but it is impossible to implement all of them, because the set of such libraries is not limited to those in the PHP distribution. That is why we decided to use the existing .dll libraries and to create a component that makes it possible to call the contained functions directly from .NET, and thus from both C# and compiled PHP code.
There are two means of loading PHP extensions: collocated or isolated. The web server administrator may configure individual extensions depending on their reliability, preferring either performance or safety:
- Trusted extensions may be collocated in the main process address space, in the Phalanger AppDomain, which makes execution 5 to 10 times faster.
- Untrusted extensions may be loaded into the address space of an isolated process (ExtManager), which protects the main process from unmanaged exceptions.
The isolated dynamic libraries are loaded into the address space of the ExtManager, which is written in MC++ (C++ with managed extensions) and communicates with the ASP.NET server using .NET Remoting. The communication takes place via shared memory managed by our remoting channel, called ShmChannel. This channel is part of the Phalanger project but may be used independently in any other .NET application as a faster alternative to the TcpChannel and HttpChannel shipped with the .NET Framework. The ExtManager simulates the PHP interpreter environment for the hosted libraries so that their functionality is the same as in PHP. Since the extension libraries contain unmanaged (native) code, they may cause exceptions that terminate the calling process. In such a case, the ASP.NET server starts a new ExtManager process, and the server process itself is never put at risk.
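The isolation strategy described above can be modelled with an ordinary child process. The sketch below is in Python and is only an analogy: the IsolatedHost class, the line-based pipe protocol, and the "crash" command are assumptions made for illustration, and pipes stand in for the shared-memory ShmChannel. It shows a host that survives the death of its worker process and starts a replacement.

```python
import subprocess
import sys
from typing import Optional

# Source of the hypothetical worker process. It echoes each request in upper
# case and terminates abruptly on "crash", standing in for a fatal native fault.
WORKER = r"""
import os, sys
for line in sys.stdin:
    cmd = line.strip()
    if cmd == "crash":
        os._exit(3)
    print(cmd.upper(), flush=True)
"""


class IsolatedHost:
    """Keeps risky code in a child process and restarts it when it dies."""

    def __init__(self) -> None:
        self.restarts = 0
        self._start()

    def _start(self) -> None:
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def call(self, request: str) -> Optional[str]:
        """Send one request; return the reply, or None if the worker died."""
        reply = ""
        try:
            self.proc.stdin.write(request + "\n")
            self.proc.stdin.flush()
            reply = self.proc.stdout.readline()
        except OSError:
            pass  # a broken pipe also means the worker is gone
        if reply == "":
            # End of stream: the worker terminated. Replace it; the host lives on.
            self.proc.wait()
            self.proc.stdout.close()
            self.restarts += 1
            self._start()
            return None
        return reply.strip()

    def close(self) -> None:
        """Shut the current worker down cleanly."""
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

A collocated extension would correspond to calling the same function directly in the host process: faster, because no inter-process communication is needed, but a fatal fault would then take the host down with it.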
The server administrator who installs Phalanger and decides which libraries are available to page developers will have the tools necessary to install any library implementing the interface designed by the PHP authors. The installation is performed by a standalone application shipped with Phalanger, which generates code encapsulating the given library. We call this code a managed wrapper of the extension. It gives access to the functions contained in the extension from the managed environment (for example C#) via the ExtManager. The wrapper contains definitions (stubs) of the encapsulated functions together with their type information. Since the dynamic libraries carry no such information (about function arguments or return value types), it has to be provided by the user in XML format. For the most commonly used extensions, these type-information files are part of the Phalanger installation, together with the compiled wrappers.
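How typed stubs might be derived from an XML description can be sketched as follows. The XML layout, the type names, and all function names below are invented for illustration; the actual Phalanger type-information format is not given in this text. The example is in Python, with an ordinary function standing in for the call through the ExtManager.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable, Dict, List

# Hypothetical type-information file for a made-up extension.
TYPE_INFO = """
<extension name="demo">
  <function name="demo_repeat" returns="string">
    <param name="text" type="string"/>
    <param name="times" type="int"/>
  </function>
</extension>
"""

# Assumed mapping from declared type names to converters.
CONVERTERS: Dict[str, Callable[[Any], Any]] = {
    "string": str, "int": int, "float": float, "bool": bool,
}

Invoker = Callable[[str, List[Any]], Any]  # forwards a call to the extension


def build_wrapper(xml_text: str, invoke: Invoker) -> Dict[str, Callable[..., Any]]:
    """Generate one typed stub per function declared in the XML description."""
    root = ET.fromstring(xml_text.strip())
    stubs: Dict[str, Callable[..., Any]] = {}
    for fn in root.findall("function"):
        name = fn.get("name")
        params = [CONVERTERS[p.get("type")] for p in fn.findall("param")]
        result = CONVERTERS[fn.get("returns")]

        # Default arguments bind the per-function values for this stub.
        def stub(*args: Any, _name=name, _params=params, _result=result) -> Any:
            if len(args) != len(_params):
                raise TypeError(f"{_name} expects {len(_params)} arguments")
            converted = [conv(arg) for conv, arg in zip(_params, args)]
            return _result(invoke(_name, converted))

        stubs[name] = stub
    return stubs


def fake_extension(name: str, args: List[Any]) -> Any:
    """Stand-in for the untyped native library."""
    if name == "demo_repeat":
        return args[0] * args[1]
    raise LookupError(name)
```

The stub is the only place where types are known: the stand-in extension receives already converted arguments, which mirrors why the type information has to be supplied separately from the library.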
ASP.NET cooperation
The cooperation is enabled by an object implementing an interface for responding to requests sent by clients to a web server (an HTTP handler). A request is received by IIS (Internet Information Services), which should be configured to pass .php page requests to the ASP.NET server. This server (process) hosts .NET Framework applications. Its address space contains several logical spaces called AppDomains, each created to serve requests for one web application (not only Phalanger ones). When a request for a Phalanger application arrives at the server, an object called the request handler is created and starts to process the request. First, it checks whether a cached compilation of the requested script exists. If the compiled code is not found, the compiler is run to create it and store it in the cache. The compiled code is then instantiated and executed within a script context, which is also created per request. The context holds configuration that can be modified at run time, along with other state that is valid for the duration of the request.
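The request flow described above (check the cache, compile on a miss, then run in a fresh per-request context) can be sketched as a small model. The code is Python, not the actual handler, and the class names, the "{greeting}" placeholder, and the configuration key are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ScriptContext:
    """Per-request state: modifiable configuration plus the output buffer."""
    config: Dict[str, str]
    output: List[str] = field(default_factory=list)


CompiledScript = Callable[[ScriptContext], None]


class RequestHandler:
    def __init__(self, sources: Dict[str, str], defaults: Dict[str, str]) -> None:
        self.sources = sources          # script path -> source text
        self.defaults = defaults        # application-wide configuration
        self.cache: Dict[str, CompiledScript] = {}
        self.compilations = 0

    def _compile(self, path: str) -> CompiledScript:
        """Toy compiler: the result substitutes a config value into the text."""
        self.compilations += 1
        text = self.sources[path]

        def run(ctx: ScriptContext) -> None:
            ctx.output.append(text.replace("{greeting}", ctx.config["greeting"]))

        return run

    def handle(self, path: str, overrides: Optional[Dict[str, str]] = None) -> str:
        script = self.cache.get(path)
        if script is None:                      # cache miss: compile and store
            script = self._compile(path)
            self.cache[path] = script
        # A new context per request, so changes never leak between requests.
        ctx = ScriptContext(config={**self.defaults, **(overrides or {})})
        script(ctx)
        return "".join(ctx.output)
```

In this model the second request for the same path reuses the cached compilation, and a configuration change made for one request does not affect the next.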
ASP.NET applications are configured using XML configuration files. The main file (Machine.config) is stored in the .NET Framework directory. Individual web applications are configured using Web.config files stored in their respective directories, with the configuration being inherited through the directory hierarchy. Phalanger's configuration uses the same mechanism. In a typical installation, there is a Phalanger virtual directory in the root directory of the web server. The main configuration file (with options similar to those of the php.ini file) is the Web.config in the Phalanger directory. This file is managed by the web-server administrator, who creates subdirectories for individual web applications and allows them to reconfigure some settings using their own Web.config files.
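Inheritance through the directory hierarchy amounts to merging settings from the root down to the requested directory, with deeper levels overriding shallower ones. The sketch below models this in Python with plain dictionaries instead of XML files; the directory names and the setting names are assumptions chosen to resemble php.ini options, not actual Phalanger configuration keys.

```python
from pathlib import PurePosixPath
from typing import Dict

# Hypothetical settings found at each level of the virtual directory tree.
CONFIG_FILES: Dict[str, Dict[str, str]] = {
    "/": {"display_errors": "off", "memory_limit": "64M"},   # machine-wide level
    "/Phalanger": {"memory_limit": "128M"},                  # main Phalanger file
    "/Phalanger/shop": {"display_errors": "on"},             # one application
}


def effective_config(directory: str) -> Dict[str, str]:
    """Merge settings from the root down to the given directory."""
    path = PurePosixPath(directory)
    chain = [path, *path.parents][::-1]       # root first, requested directory last
    merged: Dict[str, str] = {}
    for level in chain:
        # Later (deeper) levels override values inherited from above.
        merged.update(CONFIG_FILES.get(str(level), {}))
    return merged
```

Under these assumptions, effective_config("/Phalanger/shop") inherits memory_limit from the Phalanger directory and overrides display_errors locally, while a directory without its own file simply inherits everything.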
Requirements
Phalanger is a Microsoft .NET Framework application. It requires Microsoft .NET Framework 2.0 or 4.0 with the latest Service Packs and a Microsoft Windows XP/2003/2008/Vista/7 operating system.
Integration with ASP.NET is the principal feature of Phalanger. This requires Internet Information Services (IIS) version 6 or newer with ASP.NET installed. On IIS-enabled systems, ASP.NET is installed automatically by the .NET Framework setup.
