13 April 2008

Microsoft STL Performance

On the Boost discussion group a thread has started about the High Cost of MS "Safe" STL for Release Builds. It is an interesting look at the massive differences between the "safe" and "non-safe" options in the Microsoft STL.

I don't think there has been a clear look at the efficiencies and inefficiencies in the MS-STL implementation, especially with the different options (iterator debugging and safe options). There have been some attempts at comparing different STL implementations, but it is always difficult to do a good comparison, the main problem being comparing the correct "latest versions".
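For anyone wanting to experiment, here is a minimal sketch of how those options are usually switched off. It assumes the Visual C++ 2005/2008 macros _SECURE_SCL and _HAS_ITERATOR_DEBUGGING; the function sum_values is a hypothetical hot loop of my own and not taken from the Boost thread.

```cpp
// Assumption: these are the Visual C++ 2005/2008 switches for checked
// iterators and iterator debugging. Other compilers simply ignore them.
// They must be defined before any standard header and be identical in
// every translation unit, so project-wide settings are the safer place.
#define _SECURE_SCL 0               // checked iterators off
#define _HAS_ITERATOR_DEBUGGING 0   // iterator debugging off (debug builds)

#include <vector>

// Hypothetical hot loop: with checked iterators enabled, each increment
// and dereference below carries an extra validity check.
long sum_values(const std::vector<int>& values)
{
    long total = 0;
    for (std::vector<int>::const_iterator it = values.begin();
         it != values.end(); ++it)
    {
        total += *it;
    }
    return total;
}
```

Timing a loop like this with the macros set to 0 and then to 1 is probably the quickest way to see how much the checks cost in your own code.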

Going back a while I was justifying the use of STLPort instead of Microsoft STL supplied with Visual Studio. Simply, it was a performance thing with both memory and efficiency. Certainly with the new STLPort visualisers for the Microsoft debugger it is an even playing field for ease of use.

You can read those here:
* Visual Studio 2005 - Lets Break Everything! - a bit about how to switch off some of the more annoying "features".
* Development - STLPort versus Microsoft STL performance - A quick summary of my observations on the performance in a real-world complex application.
* More on STLPort and Microsoft STL performance - A little more background.

11 April 2008

Sharing Is Good - The Open-Source Insomniac

Insomniac Games, who are behind Ratchet & Clank and Resistance: Fall Of Man, have recently decided to start sharing knowledge. Gaming is a notoriously secretive industry that is not renowned for its openness, and Insomniac think it's mad that developers keep reinventing the same core pieces of functionality that don't even make up the bread'n'butter of gaming.

First up is the R&D section on their website. This section contains presentations and papers about various subjects including graphics, gameplay, memory and performance.

Next up is the open-source BSD-style licensed code. The section is called the Nocturnal Initiative and is a wiki. There are links to community forums which aren't busy yet.

At the moment some of the source code available is:
  • C++ Delegate/Event System
  • Pointer to reference counted objects
  • Pointer to heap allocated arrays
  • Endian conversion code
  • Insertion ordered std::set
  • Reversible key and value std::map
  • Interprocess Communication
    • Fast non-blocking message-based design
    • Works over BSD-style sockets (TCP) or Named Pipes
    • Windows platform included, additional platforms are easily added
  • Debugging Helpers
    • PDB-based symbol information querying
    • Capture stacks within your program making heap object tracking and leak debugging easier
    • Give your application automatic crash reports dispatched via email that contain handy user/machine information, call stack, and memory page allocation stats
  • Console Output Manager
    • Log console output to one or more trace files
    • Color code the console output based on Error/Warning/Debug print statements
    • Throttle output verbosity (configured via command line arg or environment variable)
    • Outline nested stages of processing performed by your application (for builders/exporters)
    • Augment crash reports with the current outline state of your application (very useful for tracking down new crash bugs)
  • Instrumenting Profiler
    • Cross platform (Windows and PS3 currently)
    • Macro-instrumented stack timer based profiler
    • Concise profile report printed out at program exit
    • Logs instance data out to human readable log file (Profile Analyzer is in development)
  • C++ Reflection
    • Instrument your application classes and register them with the type registry
    • Can serialize object instances to XML or our (faster) custom binary format
    • Flexible parsing mechanics allow you to read in old versions of your class
    • Handles renaming member variables as well as changing member type (within reasonable limits)
    • Supports serializing std::vector, std::set, and std::map containers with primitives or pointers to other reflect objects
    • Supports serializing enum and bitfield members using string representation (supports reordering enum elements)
    • Provides for automatic object comparison and cloning
    • Implements introspection using a visitor interface

Some of it looks pretty good. I hope this means that there is more sharing of C++ code which is always a good thing, maybe even some of the libraries could be Boost-ified...


In a few previous posts I have mentioned some alternative STL implementations. You can read about rdestl here as well as uSTL and stdcxx here.

I've stumbled across a couple of other STL implementations for more specific purposes that I thought some people might find useful.

First up is MCSTL - The Multi-Core Standard Template Library which is a multi-core implementation of certain STL algorithms. This has actually been integrated into the GCC STL implementation with version 4.3. It uses OpenMP internally for the multi-threading so would be limited to compilers with valid implementations of that functionality.

Next up there is STXXL: Standard Template Library for Extra Large Data Sets. I think the blurb sums it up:
The core of STXXL is an implementation of the C++ standard template library STL for external memory (out-of-core) computations, i.e., STXXL implements containers and algorithms that can process huge volumes of data that only fit on disks. While the compatibility to the STL supports ease of use and compatibility with existing applications, another design priority is high performance.
Having seen some extra long and large computations on huge data sets, I can see this being used to get around the limitations of platforms with less addressable space.

10 April 2008

Visual C++ 2008 Feature Pack Released

Good news for Windows C++ developers: the update to Visual Studio 2008 has been released. This "Feature Pack" contains some of the new TR1 C++ standard library as well as a major MFC update.

Details from the Visual C++ Team Weblog can be found here with some videos and links.

TR1 Update

The TR1 update is an integration of some more Dinkumware library functionality. The features available are:
* array - Defines the container template class array and several supporting templates.
* functional - Defines several templates that help construct function objects, which are objects of a type that defines operator(). A function object can be a function pointer, but more typically, the object is used to store additional information that can be accessed during a function call.
* memory - Defines a class, an operator, and several templates that help allocate and free objects.
* random - Defines many random number generators.
* regex - Defines a template class to parse regular expressions, and several template classes and functions to search text for matches to a regular expression object.
* tuple - Defines a template tuple Class whose instances hold objects of varying types.
* type_traits - Defines templates that provide compile-time constants that give information about the properties of their type arguments.
* unordered_map - Defines the container template classes unordered_map and unordered_multimap and their supporting templates.
* unordered_set - Defines the container template classes unordered_multiset and unordered_set and their supporting templates.
* utility - Defines several general templates that can be used throughout the Standard Template Library.
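
To give a flavour of a few of these headers, here is a small sketch. The Feature Pack exposes the components in namespace std::tr1; this example assumes a later C++11 compiler, where the same components live directly in std. The function names (count_words, lookup, add_ten) are hypothetical and only there for illustration.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <tuple>
#include <unordered_map>

// array + unordered_map: count how often each word occurs.
std::unordered_map<std::string, int>
count_words(const std::array<std::string, 4>& words)
{
    std::unordered_map<std::string, int> counts;
    for (std::size_t i = 0; i < words.size(); ++i)
        ++counts[words[i]];
    return counts;
}

// tuple: return a word together with its count (0 if it is absent).
std::tuple<std::string, int>
lookup(const std::unordered_map<std::string, int>& counts,
       const std::string& word)
{
    std::unordered_map<std::string, int>::const_iterator it = counts.find(word);
    return std::make_tuple(word, it == counts.end() ? 0 : it->second);
}

int add(int a, int b) { return a + b; }

// functional + memory: bind one argument, then share ownership of the result.
std::shared_ptr<int> add_ten(int value)
{
    std::function<int(int)> plus_ten =
        std::bind(add, 10, std::placeholders::_1);
    return std::make_shared<int>(plus_ten(value));
}
```

With the Feature Pack itself the types would be spelled std::tr1::array, std::tr1::shared_ptr and so on, and make_shared is not part of TR1, so that line would need a plain new instead.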

MFC Update

The MFC Update integrates BCGSoft's libraries into the base MFC and provides masses of useful user interface constructs for native developers. Details from the Visual C++ Weblog are here.

Some of the new features are:
* Office 2007 Ribbon Bar: Ribbon, Pearl, Quick Access Toolbar, Status Bar, etc.
* Office 2003 and XP look: Office-style toolbars and menus, Outlook-style shortcut bar, print preview, live font picker, color picker, etc.
* Visual Studio look: sophisticated docking functionality, auto hide windows, property grids, MDI tabs, tab groups, etc.
* Internet Explorer look: Rebars and task panes.
* Vista theme support.
* “On the fly” menus and toolbar customization: users can customize the running application through live drag and drop of menu items and toolbar buttons.
* Shell management classes: use these classes to enumerate folders, drives and items, browse for folders and more.

You can download all this from here.

Git For Windows - msysgit

Today I finally got a chance to install msysgit - the Git port for Windows using MinGW.

I was extremely impressed with the painless install and the ease of integration (into the shell context menu and the command line). The installer size is a trim 8MB so it is a quick download to try out.

The main thing I have tried out at the moment is the newly functioning (for Windows) Subversion bridge. The git-svn import seems fairly speedy even on Windows and works seamlessly. It is interesting how efficiently the data gets stored in the repository as well, and the possibility of reconstructing information that has been lost due to file moves and other lossy operations in Subversion.

Git-GUI also works. Obviously it does not look spiffy and shiny but presents the information you need, and access to the operations you need (on a basic level).

I would say Git For Windows is very close to being "ready" and providing you are not in need of the more difficult corner cases it is ready for production use. The guys working on it have done a great job.

Now all we need is for git-cheetah, their equivalent of TortoiseSVN, to take off (which is probably getting easier thanks to some of the Tortoises now sharing code to do with displaying overlays).

Git Is The New Unix

There is a great article about what Git really means as a platform. You can read it here.

This kind of opened my eyes to how Git differs from not only other source control systems, but other distributed source control systems. I recommend giving it a quick read.

GTK+ The Future

Lots of information has been revealed recently about possible future directions for GTK+. There is a comprehensive article on ArsTechnica here.

Imendio, one of the active developers of GTK+, made a recent presentation you can see here. This outlines what they see as the future direction and release plans for GTK+, with more detail available in the position document. Essentially it involves taking it forward by having a clear roadmap to break the ABI, which has been stable for a long time but has limited leaps forward in development.

Havoc Pennington also had a proposal, which is available here. His proposal did not necessarily overlap the Imendio one but involved using a scenegraph API for defining the UI. He also suggests looking at the OpenGL based Clutter which uses a GObject API for its rich user interface experience.

Some additional experimentation has been made with integrating OpenGL with GTK+. A lot of work has been done to try and keep GTK+ up to date with theming and transparency but it can look off the pace, especially when attempting to mimic the look of a native platform. Hopefully all these changes will improve the situation and push the GUI even further, since Windows development tools seem to be pushing the "every application looks different" paradigm, Apple is making iTunes look the same on all platforms, and applications like Songbird use the same theme on all platforms (and still look good).

05 April 2008

ACCU 2008 Conference

Disclaimer: These are my interpretations of what I learnt from the talks rather than a transcription of what they said. This means that I probably misheard and misinterpreted some parts which may be hazardous to your health.

DisplayLink were very generous and allowed me to attend the ACCU 2008 conference this year. I chose to go for Wednesday and Thursday's talks, but next time I plan to attend the whole conference. I thought I would present some thoughts on those talks I did attend.

Overall it was well worth spending the time at the conference, meeting a variety of interesting people. I know people always say you learn more in the bar afterwards but I would say there would have to be some pretty intense knowledge exchanges to beat the information I picked up over the two days.

Value Delivery For Agile Environments
Tom Gilb

I can sum up the talk in three words "Measure Measure Measure". Tom Gilb used his keynote to explain EVO, an envelope framework to surround a smaller development-centric process, which was in this case Agile. He sees Agile as deficient in that it is a development process geared for delivery, but less thought is put into what you actually deliver.

The problem comes then with what do you measure, how you measure, and then how do you interpret those metrics. By doing this and combining it with a fast deliverable methodology like Agile then you end up with constant iterations with feedback able to deal with the changing nature of the world (most probably defined by requirements).

I felt that the talk had an implicit feeling of "How To Survive". You need to identify your stakeholders, the people that determine the success and failure of your project and make sure that the needs of the most important and influential ones are met. If they like what you are doing by meeting and possibly exceeding their needs then you are more likely to gain extra resourcing as you are then seen as a successful group.

Bits And Mortar
Ric Parkin

Like an extended episode of Grand Designs we were taken through an analysis of buildings. Well, no not really, but some of the theories that we are using in computer science have been looked at before and not only recently, in a completely different problem domain, and this field is architecture (and the evolution of buildings).

I'm not the world's biggest fan of analogies because pedants always want to poke holes in them and take them off course, thus negating any benefit from using them. I wish I never used analogies but I am like a lemming following everyone else. Luckily there was a thoughtful audience and the core was suitably abstract to avoid those problems.

The basis of the talk was the work of an architect Christopher Alexander. He posited the theory of patterns which obviously directly relates to what engineers are doing right now, and in fact sometimes he doesn't even refer to architecture and buildings.

Due to my ignorance all of this was completely new to me, and I could take a lot away from the talk because a lot of the ideas about the evolution of a building (and therefore design) are directly applicable to the realm of computer science. If you look at a building as a finished, unchanging product after it has been completed then you forget about the lifetime of the building and how it evolves, much like a codebase. Knowing when to rebuild or rip down parts requires suitable knowledge of what you are doing, and you can also apply patterns other people have already proved to be successful.

Unfortunately (for Ric) the talk will be forever remembered as the place he uttered in public "I don't mind introducing bugs".

Practical Multi-Threading
Dietmar Kuehl

This talk covered the basics of the new C++ standard. It was a packed room so a lot of people are interested in this area.

It certainly looks like writing multithreaded applications will take much less code in C++ than it traditionally has. I like the idea of getting more functionality for less code. Items like condition variables will be supported in the C++ Standard Library. There was also a brief part about some of the TR2 features (C++0x + 1) like futures, which make concurrent processing of independent blocks of code even easier and simplify the synchronisation.

Also some of this will be helped even more by the lambda expressions as I can see some of the simple operations can be kicked off and calculated independently on a single line.

Also there was some coverage of Intel's Threading Building Blocks which provides concurrent containers and also concurrent algorithms like parallel_for or parallel_reduce. This all provides some higher level semantics for expressing the concepts of multiple threaded processing.

When Good Architectures Go Bad
Mark Dalgarno

This was more of an interactive session where people's experiences fed directly into the talk, so it means each time you would hear a write-up about it there would be a different opinion. Luckily my group had some interesting anecdotes. I do wonder why we all stay working in computers if we suffer this much abuse(!)

We used our experiences of the world to come up with examples of where the architecture had begun to "smell" and what this represented. Then looking at case studies we attempted to identify and find potential solutions to eliminate these smells. For my example of a system that had been going for a very long time through so many different platforms, teams, languages, I said to cancel it because it was not making enough money to warrant its existence.

The most frightening solution to architectural decay which also came up at the SPA conference when Mark did it previously, was "Kill The Architect". I thought I was cynical(!)

This was a talk where you got more out of it if you put more into it. Hopefully Mark will put up some of the responses he got from the audience on his blog. In fact he could probably write a very frightening book about it.

Caging The Effects Monster: The Next Decade's Big Challenge
Simon Peyton-Jones

Unfortunately due to beer-and-lack-of-sleep-related circumstances I decided to close my eyes in a darkened room for the duration of this talk. I did not feel too guilty as I had seen Simon's talks for the BCS SPA Cambridge meetings. I posted about that talk here.

The functional programming track at the ACCU Conference seemed to be really well attended and seems to have grabbed the imagination of a number of people.

The Future Of Concurrency In C++
Anthony Williams

This talk was given by the maintainer of the Boost.Threads library. He went through some of the more complex parts of the upcoming C++0x and C++0x TR2 as well as what is available through Boost.Threads currently.

One of my favourite parts of the entire thing is the concept of thread-local storage as a built-in keyword. No more GetTLS and the suchlike. I could immediately see a use of a static member of a class that is per-thread in order to create a memory allocator for STL containers which would allocate via only the thread's heap. If you know that some information is local to a single thread then you won't have any memory contention (in the program - I am not thinking about the hardware or underlying implementation) to slow down the memory access. You have to have a clear design and use of this though otherwise you could blow your program's brains out, but also that design works very nicely with thread pools...
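
A minimal sketch of the per-thread static member idea, assuming the keyword is spelled thread_local as it is in C++11 compilers. The AllocationStats class and record_on_worker helper are hypothetical names of mine; this only shows that each thread gets its own copy of the member and is nowhere near a full STL allocator.

```cpp
#include <cstddef>
#include <thread>

// Hypothetical per-thread bookkeeping: each thread sees its own counter,
// so no locking is needed to update it.
class AllocationStats
{
public:
    static void record(std::size_t bytes) { bytes_allocated_ += bytes; }
    static std::size_t total() { return bytes_allocated_; }

private:
    static thread_local std::size_t bytes_allocated_;
};

thread_local std::size_t AllocationStats::bytes_allocated_ = 0;

// Records an allocation on a worker thread and reports the total that
// the worker saw, which is independent of the calling thread's total.
std::size_t record_on_worker(std::size_t bytes)
{
    std::size_t seen = 0;
    std::thread worker([&seen, bytes] {
        AllocationStats::record(bytes);
        seen = AllocationStats::total();
    });
    worker.join();
    return seen;
}
```

If the main thread records 100 bytes and then calls record_on_worker(40), the worker reports 40 and the main thread's total stays at 100. That isolation is what would let a per-thread heap avoid contention.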

Unfortunately some of the higher level concepts will take until probably the next standard TR2 to get to compilers. Of note this will contain thread pools and futures. Futures mean you can run a thread for a calculation and start it off in a single statement and then check the result after doing some more work and it will wait until the result is posted. Also getting the result will then propagate any exceptions that had occurred on the calculation thread.
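
As a rough sketch of that futures workflow, here is how it reads with the std::async and std::future names that C++11 compilers provide (an assumption on my part, since the talk placed futures in TR2). The sum_all and sum_in_background functions are hypothetical.

```cpp
#include <functional>
#include <future>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical calculation; it throws on empty input so that exception
// propagation through the future can be observed.
long sum_all(const std::vector<int>& values)
{
    if (values.empty())
        throw std::invalid_argument("nothing to sum");
    return std::accumulate(values.begin(), values.end(), 0L);
}

long sum_in_background(const std::vector<int>& values)
{
    // Start the calculation on another thread in a single statement.
    std::future<long> result =
        std::async(std::launch::async, sum_all, std::cref(values));

    // ... the calling thread is free to do other work here ...

    // get() waits until the result is posted; if sum_all threw, the
    // exception is rethrown here on the calling thread.
    return result.get();
}
```

Calling sum_in_background with an empty vector throws std::invalid_argument in the caller, even though the throw happened on the calculation thread.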

The good thing is a large portion of the C++0x threading implementation is available through Boost.Threads thanks to Anthony's sterling efforts for 1.35, so you can always have a play.

Adobe Source Libraries : Overview And Philosophy
Sean Parent
Adobe Source Libraries (ASL) Open-Source
Lots of interesting papers and articles
Alexander Stepanov's collection of papers, articles and presentations

I can think of worse things to be remembered for but I hope that Sean Parent is not only thought of as Alexander Stepanov's boss. He heads up the Software Technologies Lab at Adobe that create generic libraries that are used in all Adobe's products. This talk was divided into two sections, one concentrating on the data structures and generic programming, and the other about declarative UI.

A very interesting part of the talk was the way he said he was using Alexander Stepanov's skills: he basically said "Write a book defining generic programming". The research for this has led to lots of leaps forward for Adobe's programming technologies. Also he said that Stepanov gives lectures to the Adobe programming staff to improve their education.

This talk started a little above my head by defining Regular Types which is very similar to the definitions and rules for what can be derived for functional programming except you can have side-effects. This then provides the basis for generic programming.

We had a close look at the ideas behind the Move library currently maintained by Adobe but could go (back) into Boost. This library uses Return Value Optimisation to minimise and eliminate copies, and I was surprised to learn this was through the use of passing by value rather than by reference.
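
The pass-by-value idea can be sketched as follows. This is not Adobe's Move library; it assumes a C++11 compiler and uses std::move as a stand-in, and the Document class and make_document function are hypothetical.

```cpp
#include <string>
#include <utility>

class Document
{
public:
    // Sink parameter taken by value: a temporary argument is constructed
    // (or moved) straight into `title`, then moved into the member, so
    // no deep copy is made. An lvalue argument is copied exactly once.
    explicit Document(std::string title) : title_(std::move(title)) {}

    const std::string& title() const { return title_; }

private:
    std::string title_;
};

// Takes and returns by value: the compiler can elide or move the
// returned object rather than copying it.
Document make_document(std::string base)
{
    base += ".psd";
    return Document(std::move(base));
}
```

Passing a named string to the constructor leaves the caller's string untouched, while passing a temporary costs no copy at all. Taking a const reference instead would force a copy in both cases.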

There was also a look at the copy_on_write functionality which meant the object is only copied when it is written to and it uses the move library as part of its basis. This also provides the platform for Adobe's history tool in Photoshop and minimises memory impact. Then there was a look at the Forest container which approaches the binary tree in a very tidy way. Also they have a string library that uses the move library so concatenations are much more efficient.
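
To make the copy-on-write idea concrete, here is a minimal sketch built on std::shared_ptr. It is not Adobe's copy_on_write implementation: the CopyOnWrite class and its member names are hypothetical, and it assumes single-threaded use because checking use_count() is not a safe test when other threads may be copying the handle.

```cpp
#include <memory>
#include <utility>

template <typename T>
class CopyOnWrite
{
public:
    explicit CopyOnWrite(T value)
        : data_(std::make_shared<T>(std::move(value))) {}

    // Reading never copies: all handles share one underlying object.
    const T& read() const { return *data_; }

    // Writing detaches first if anyone else still shares the object,
    // so other handles keep seeing the old value.
    T& write()
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<T>(*data_);
        return *data_;
    }

    // True while both handles still point at the same storage.
    bool shares_with(const CopyOnWrite& other) const
    {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<T> data_;
};
```

Copying a handle is cheap, and only the first write() after a copy pays for duplicating the data. That is roughly why a history of states can share most of its memory between entries.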

They are the right tools to solve certain datatype problems in a very concise and efficient fashion. But the idea behind it all is to only use small pieces of code as building blocks towards the larger solutions. What they want to do is reduce the number of lines of code declaring the applications by a very large factor (like from 3 million to 30,000).

The second half of the talk was based around declarative UI and the structures Adobe have put in place to solve problems that still exist to this day. I've looked at the two main libraries that make this up before called Adam and Eve, where both really amount to being constraint solvers, one solving the data and one solving the layout.

The layout library is probably the simplest to explain as it works out from the size algorithms you provide the layout that follows the guides that have been set up. This also means that it can scale the layouts with relative ease. The scripting language is used to define the layout called Eve.

The property library is used to solve data dependencies (a lot of which are typically cyclic) for user interfaces. I have used something similar but much more simplistic in the past because it keeps the data separate from the UI itself. This also means that you can script the operations.

The papers and documentation on the Adobe site can probably explain all this much better than I ever could, but they are all interesting building blocks for solutions to some overlooked problems.

Overall I enjoyed the talk, especially the first half, as their practical approach to implementing generic programming with real benefits was a bit of an eye-opener. Unfortunately the talk was really badly attended, as both halves were up against some tough competition (especially about functional programming), but hey, they all missed out on some really good stuff.