Parallel programming isn't hard, optimisation is.

Created on Jan. 17, 2012, 2:31 p.m.


I'm standing in the living room and I ask my flatmates if any of them want to go to the pub. I then count the positive replies. I've just computed something in parallel. An exact instance of map-reduce. And it comes as naturally (if not more naturally) to me as the sequential approach.
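The pub question can be written down almost word for word. What follows is only a minimal sketch: the flatmates' names and answers are invented for illustration, and a thread pool stands in for everyone answering at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical flatmates and their answers, invented for this example.
ANSWERS = {"Alice": True, "Bob": False, "Carol": True, "Dave": True}


def wants_pub(name):
    """Map step: each flatmate answers independently of the others."""
    return ANSWERS[name]


def count_yes(names):
    """Ask everyone at once (map), then tally the positive replies (reduce)."""
    with ThreadPoolExecutor() as pool:
        replies = list(pool.map(wants_pub, names))
    # True counts as 1 and False as 0, so the sum is the number of "yes" replies.
    return sum(replies)


if __name__ == "__main__":
    print(count_yes(list(ANSWERS)))
```

Nothing in `wants_pub` depends on anyone else's answer, which is what makes the map step safe to run in any order.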

We have the human intuition and analogy. Almost all parallel problems can be expressed as "Don't touch that while I'm doing something with it", or "Everyone do something and then let me gather the results". The only remaining hard part is the locality information: for example, a particle system where a single particle is only affected by the nearest ten other particles. Even then, most instances of locality can be expressed simply by how the data is partitioned in map-reduce.
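As a rough illustration of locality expressed through partitioning, here is one possible sketch that buckets 2D particles into a uniform grid. The function names, the grid scheme and the `cell_size` parameter are all assumptions made for this example. It only narrows down the candidate neighbours, and it assumes the cell size has been chosen so that the particles that matter fall in the same or an adjacent cell.

```python
from collections import defaultdict


def partition(particles, cell_size):
    """Bucket 2D points by the grid cell that contains them."""
    grid = defaultdict(list)
    for x, y in particles:
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell].append((x, y))
    return grid


def candidates(grid, cell):
    """Points in a cell and its eight neighbours.

    This is the only data a worker responsible for `cell` needs to read.
    """
    cx, cy = cell
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get avoids creating empty cells in the defaultdict.
            found.extend(grid.get((cx + dx, cy + dy), []))
    return found
```

Each cell can then be handed to a separate worker, with the partitioning carrying the locality information rather than the worker logic.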

Parallel concepts don't need to be hard, but the great contradiction is that whenever we are concerned with making processing parallel, we are also concerned about performance - and reasoning about the performance of parallel computation is hard. Any assumptions you carry over from sequential code can be thrown out the window. Suddenly the hardware and cache become a major issue, and what was before a relatively small search space, made up of special data structures, C tricks and inline assembly, has become this huge hulking monster of possibilities.

I've been working on using OpenCL to do non-conventional rendering techniques and my exploration has led me toward needing to implement an instance of the Marching Cubes algorithm, something I've done before. What has me tied up this time is the vast number of possible parallel approaches at arm's reach. I can at least vaguely reason about how difficult the various approaches are to implement, but when it comes to predicting the expense of them I'm somewhat in the dark. Big O notation can't help me here.

To give you an idea, one approach is to fit marching cubes into a map-reduce pattern - which is tempting for its conceptual simplicity, but unfortunately reduction can't be completely automated in OpenCL and would require a variable number of reduction passes depending on the data size. It ends up more complicated than it seems. Another option is to essentially put a lock around two buffers and fill them up as workers finish, but I have no idea how much congestion this would cause. I could even do most of the computation in a very fast, single dispatch, but then I would need a final clean-up pass which would have to run sequentially and may end up very expensive. None of the approaches are really trivial enough to just mock up and test out, and I can't find a whole bunch of testimonies either.
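To make the point about variable reduction passes concrete, here is a small sketch in plain Python rather than OpenCL. It assumes each pass combines fixed-size groups of values, the way one kernel dispatch might; the function name and the `group_size` parameter are hypothetical.

```python
def reduce_in_passes(values, group_size=2):
    """Sum values using repeated passes and report how many passes it took.

    Each pass stands in for one dispatch in which every worker sums one
    group of `group_size` neighbouring values.
    """
    values = list(values)
    passes = 0
    while len(values) > 1:
        values = [
            sum(values[i:i + group_size])
            for i in range(0, len(values), group_size)
        ]
        passes += 1
    total = values[0] if values else 0
    return total, passes


if __name__ == "__main__":
    for n in (1, 8, 1000):
        print(n, reduce_in_passes(range(n)))
```

The number of passes grows roughly with the logarithm of the input length, so the host code has to decide how many dispatches to issue at run time instead of hard-coding a single one.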

Until programmers can begin to reason about the expense of parallel programming without resorting to benchmarks, it is always going to be somewhat of a struggle for them to adopt it. This is something that hardware developers, language developers and academia all need to work on together.

For a great lecture on the subject and a fascinating delve into the hardware, take a look at this by Bill Dally.
