Overview

Current research interests include applications of statistical learning and computer vision in the arts.


Projects 


Universal Image Generator:
Hofstadter posed the question, "Is there a single algorithm
that can generate all possible typefaces?" Here we address the generalized
question: is there a single algorithm that can generate all possible images?




Direct Manipulation Blendshapes:
Direct manipulation for figure animation has long
been possible using inverse kinematics.
This work describes an easy-to-implement direct
manipulation approach for the popular blendshape facial
representation. It also interoperates with traditional
blendshape parameter (slider) editing, unlike face editing approaches
based on an underlying PCA representation. This is crucial, since
a simple mathematical argument shows that direct manipulation is
sometimes less efficient than parameter editing, while
the converse is equally true in other cases.
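
As a rough illustration of how direct manipulation can coexist with slider editing, the sketch below solves for blendshape weights that move pinned vertex coordinates toward a dragged target while staying close to the current slider values. It assumes a regularized least-squares formulation; the function names, the `alpha` parameter, and the data layout are illustrative assumptions and are not taken from the paper.

```python
def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination
    with partial pivoting (A is a list of rows)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def direct_manipulation(B_pinned, delta, w_prev, alpha=0.1):
    """Return weights w minimizing
        ||B_pinned w - delta||^2 + alpha * ||w - w_prev||^2.

    B_pinned: one row per pinned coordinate, one column per blendshape
              target (each entry is that target's offset from neutral).
    delta:    desired offsets of the pinned coordinates from neutral.
    w_prev:   current slider values; unpinned sliders stay near these.
    alpha:    hypothetical regularization strength (an assumption).
    """
    n = len(w_prev)
    # Normal equations: (B^T B + alpha I) w = B^T delta + alpha w_prev
    A = [[sum(row[i] * row[j] for row in B_pinned) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * d for row, d in zip(B_pinned, delta)) + alpha * w_prev[i]
           for i in range(n)]
    return solve(A, rhs)


if __name__ == "__main__":
    # One pinned coordinate influenced only by the first of two targets.
    print(direct_manipulation([[1.0, 0.0]], [1.0], [0.0, 0.3]))
```

Because the result is an ordinary weight vector, it can be written straight back to the sliders, and a later slider edit simply becomes the next `w_prev`.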




VisualIDs:
User interfaces need scenery, and suitable
scenery can be invented with computer graphics techniques.




Pose Space Deformation
is a creature skinning/deformation algorithm that combines aspects of skeleton-driven skinning and blendshapes and improves on each. Paper appeared in Siggraph 2000;
additional notes,
code examples,
video.




1) This sketch introduced the texture-space diffusion (TSD) approach to subsurface scattering for skin,
developed for the Matrix sequels by George Borshukov and myself
(Siggraph 2003 sketch).
The TSD idea was developed and extended by
ATI
(see also),
and Nvidia incorporated the idea as part of their approach to
real-time skin rendering.
TSD is now widely used in real-time subsurface approaches.
2) The UCAP
dense markerless face capture used on the Matrix sequels: slides.
(Also see these other Matrix virtual actor sketches:
hair,
cloth.)




Mapping the mental space of game genres.
Starting from a large online survey,
we use a classical psychological scaling technique, updated with
a modern manifold learning approach, to algorithmically produce
maps of the mental space of video game genres.




Limits to software estimation.
Can the prediction of development schedules and the assessment of programmer productivity and software quality
be codified and reduced to formula?
No: algorithmic complexity results can be directly interpreted as indicating that software development schedules, productivity, and reliability cannot be objectively and feasibly estimated, and so will remain a matter of intuition and experience.
Large Limits to Software Estimation
Supplementary Material,
link to Slashdot discussion




Perceptual segmentation for NPR sketching.
Existing NPR sketching schemes have segmented strokes using relatively
simple measures such as curvature extrema.
These schemes disregard the successive approximation
nature of sketching, in which large details are sketched first,
and small details (even if they contain curvature maxima) are only added later.
This Graphite05 paper
shows that spectral clustering can better
approximate the perceptually guided successive approximation used in sketching.




Face inpainting.
Which of these two people has thicker lips? A web survey shows that
humans find this question well posed and
are usually able to guess correctly.
This suggests that population face statistics can guide face inpainting.
On the other hand, we find that face proportion statistics are clearly non-Gaussian,
so simple PCA-based approaches will leave room for improvement.
We extend an active appearance model to employ local,
positive-only reconstruction, resulting in surprisingly good extrapolation.




Feature tracking. "Fast Template Matching", Vision Interface 1995. Describes a frequency-domain algorithm for normalized cross-correlation;
it also introduced the integral images idea to computer vision and image processing.
This algorithm is now included in the Matlab image processing toolbox and
used in several commercial software packages.
conference paper,
expanded/corrected version: .pdf,
ps.gz,
html.
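
To make the integral image idea concrete, here is a minimal sketch of a summed-area table and the constant-time window sums it provides. The last function shows how such sums yield the local energy under a template-sized window, the kind of quantity that appears in the denominator of normalized cross-correlation. The function names and the list-of-lists image layout are assumptions made for illustration, not the paper's code.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) table s with s[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    s = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            s[y + 1][x + 1] = s[y][x + 1] + row_sum
    return s


def window_sum(s, top, left, height, width):
    """Sum over a rectangular window using four table lookups."""
    bottom, right = top + height, left + width
    return s[bottom][right] - s[top][right] - s[bottom][left] + s[top][left]


def window_energy(s, s_sq, top, left, height, width):
    """Sum of squared deviations from the window mean:
    sum(f^2) - (sum f)^2 / N, with both sums read from integral images."""
    n = height * width
    total = window_sum(s, top, left, height, width)
    total_sq = window_sum(s_sq, top, left, height, width)
    return total_sq - total * total / n


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    s = integral_image(img)
    s_sq = integral_image([[v * v for v in row] for row in img])
    print(window_sum(s, 1, 1, 2, 2), window_energy(s, s_sq, 1, 1, 2, 2))
```

Each window query costs the same four lookups regardless of window size, which is what makes the normalization affordable at every candidate offset.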




Generalized fractals. "Generalized stochastic subdivision", ACM Transactions on Graphics, July 1987, applies estimation theory to remove artifacts in the popular fractal subdivision construction and generalize it to arbitrary non-fractal power spectra (e.g. ocean waves), producing a multiscale texture synthesis. This is an early application of Gaussian Process regression in graphics.
The work is discussed in the book Peitgen and Saupe, The Science of Fractal Images (Springer-Verlag 1988), section 2.6.
The 1987 article with additional figures is here
(pdf,
html preview)
Also see "Is the Fractal Model Appropriate for Terrain?"
(pdf)
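
The estimation-theoretic view of subdivision can be sketched in one dimension. In the simplified version below, each new midpoint is the best linear estimate from its two nearest neighbors under a chosen covariance function, plus noise whose variance equals the estimation error. Restricting the estimate to two neighbors, assuming a zero-mean process, and all names here are assumptions made for brevity; the article itself should be consulted for the actual construction.

```python
import math
import random


def subdivide(values, spacing, cov, rng):
    """Insert one estimated-plus-noise midpoint between each pair of samples.

    values:  current samples, `spacing` apart.
    cov:     covariance function of lag distance (assumed zero-mean process).
    """
    r0 = cov(0.0)            # variance
    r1 = cov(spacing / 2.0)  # midpoint-to-neighbor covariance
    r2 = cov(spacing)        # neighbor-to-neighbor covariance
    # Two-neighbor normal equations [[r0, r2], [r2, r0]] [a, a]^T = [r1, r1]^T
    a = r1 / (r0 + r2)
    # Variance left unexplained by the estimate (clamped against round-off).
    var = max(0.0, r0 - 2.0 * a * r1)
    sigma = math.sqrt(var)
    out = []
    for left, right in zip(values, values[1:]):
        out.append(left)
        out.append(a * (left + right) + rng.gauss(0.0, sigma))
    out.append(values[-1])
    return out


def synthesize(levels, cov, length=1.0, seed=0):
    """Run several subdivision levels starting from two endpoints."""
    rng = random.Random(seed)
    s = math.sqrt(cov(0.0))
    # Endpoints are drawn independently here, which is a simplification.
    values = [rng.gauss(0.0, s), rng.gauss(0.0, s)]
    spacing = length
    for _ in range(levels):
        values = subdivide(values, spacing, cov, rng)
        spacing /= 2.0
    return values


if __name__ == "__main__":
    # Gaussian-shaped covariance: a smooth, non-fractal spectrum.
    print(synthesize(3, lambda d: math.exp(-(d / 0.25) ** 2)))
```

Changing `cov` changes the character of the result, whereas plain midpoint displacement fixes both the averaging weights and the noise scaling in advance.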




Accelerated Blendshape Animation. Blendshape modelers
spend considerable time attempting to make blendshapes that do
not interfere. This technique reduces the interference effect
in blendshape animation.




Automatic Lip-Sync:
Automated lip-synch and speech synthesis for character animation
(CHI 87) describes an algorithm for identifying the mouth positions corresponding to speech in an audio soundtrack. Coarticulation was approximated with a now-standard smoothing approach (smoothing splines). The algorithm has been covered in Siggraph character animation courses and is the subject of U.S. Patent No. 4,913,539.




Painting with Textures.
"Texture synthesis for digital painting", (early Siggraph paper) argued that the
expressiveness of traditional painting media can be considered as a texture
synthesis problem, and presented texture synthesis algorithms suitable for
interactive painting. Note that this was a number of years prior to the
introduction of programs like Photoshop and Painter.
A small gallery of paintings made with this program.
Paper, missing some figures.




Sparse Convolution
is a simple texture synthesis method that can resynthesize some (random phase)
textures given a sample. It was introduced in the Siggraph paper (above),
and explained in more detail in this
Siggraph 89 paper.
This algorithm removes the
directional artifacts found in the standard implementation of Perlin noise and
has other capabilities. The algorithm is used in
RenderDotC and Houdini.
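
The basic construction can be sketched as a sum of kernel copies placed at randomly scattered impulses. The version below uses a radially symmetric Gaussian kernel, so no axis direction is favored. The kernel choice, parameter names, and uniform random amplitudes are assumptions for illustration; resynthesis from a sample texture would need a kernel derived from that sample, which is not shown here.

```python
import math
import random


def sparse_convolution_noise(width, height, n_impulses, sigma, seed=0):
    """Return a height x width noise image as a list of rows.

    Each impulse has a random position and amplitude; its contribution is a
    Gaussian of standard deviation `sigma`, truncated at 3 * sigma.
    """
    rng = random.Random(seed)
    radius = 3.0 * sigma
    img = [[0.0] * width for _ in range(height)]
    for _ in range(n_impulses):
        cx = rng.uniform(0, width)
        cy = rng.uniform(0, height)
        amp = rng.uniform(-1.0, 1.0)
        # Visit only the pixels the truncated kernel can reach.
        x0 = max(0, math.ceil(cx - radius))
        x1 = min(width - 1, math.floor(cx + radius))
        y0 = max(0, math.ceil(cy - radius))
        y1 = min(height - 1, math.floor(cy + radius))
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                if d2 <= radius * radius:  # circular support
                    img[y][x] += amp * math.exp(-0.5 * d2 / (sigma * sigma))
    return img


if __name__ == "__main__":
    noise = sparse_convolution_noise(32, 16, 200, 1.5)
    print(min(map(min, noise)), max(map(max, noise)))
```

The cost grows with the number of impulses rather than with a dense convolution over the whole image, and the impulse density is a direct control over how sparse or smooth the result looks.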




Coherent Phase Texture Synthesis
(unpublished research)
Many texture synthesis approaches produce
more or less 'random phase' textures, i.e.,
coherent sharp 'structural' details are not present.
This synthesis algorithm has a coherence knob that
can produce such structural details in a synthetic texture.
Sample image




Neural Network pattern synthesis.
"Creation by Refinement" (ICNN88) describes a neural net algorithm for pattern synthesis,
essentially the same algorithm as Google's Deep Dream. The paper also described the use of adversarial training examples.
Some initial (and not very impressive) explorations of
these areas appeared in the MIT Press book Music and Connectionism; a proposal
for image-based texture synthesis and a trivial demonstration of such appears
in the proceedings of Graphics Interface 91. Recall that the computers at the time were around 0.1-1 Mflop, hundreds of millions of times slower than the GPUs of Deep Dream.
(pdf, missing some figures).




Agent interface metaphor.
Soft Machine: a personable interface,
(Graphics Interface)
described an early conversational interface agent (possibly the first system
in which the agent is personified with a graphical likeness) employed as a
user interface, a "prehistoric" Siri or Alexa. In this system, a user accessed a video database via stateless
conversation with a graphical robot. The system used speech synthesis,
lip-synch, and restricted-domain speech recognition orchestrated by a standard
conversational Lisp program. The use of primitive behaviours in the robot
(subtle head movement, two 'emotional' states) and (especially) humor in the
robot's responses was intended to compensate for limited speech recognition
accuracy.
A video
showing Patrick Purcell (MIT Media Lab) interacting with the system to query an architectural database on videodisk.




Eye contact by image-based rendering.
"Teleconferencing Eye Contact" (joint work with M. Ott and I. Cox at NEC
Research). This work addressed the videoconferencing eye contact problem using view morphing
(and is one of the earliest publications on these subjects).
Two cameras are mounted around the monitor, and stereo
reconstruction is used to form a crude 3D depth model which is filled in and
smoothed using scattered data interpolation; this model is then used to warp
one of the views. A non-real-time implementation of the
technique shown at CHI 93
demonstrated that within a small range of head orientation, the
impression of eye contact is apparently determined by the overall proportions
of the facial projection rather than the eye projection specifically, and thus
even a low-resolution depth reconstruction provides an impression of
eye contact. This technique is the subject of US patent #5,359,362.








