Linux LDAP authentication with Samba4

Given a fresh Samba4 domain setup and a bunch of Linux/Windows client machines, how do you configure them to allow logins from domain users–ideally with a shared userspace mounted from Samba4? In this post, I’m going to share my experience on how I setup Ubuntu clients to authenticate via Active Directory. The default behavior in Samba4 is to use winbind, which keeps a local flat file of uid/gid mappings. I prefer letting LDAP hold that data. It’s possible to edit user accounts in the Active Directory tree to include the posixAccount LDAP class with uidNumber/gidNumber/unixHomeDirectory/loginShell attributes, and setup each Linux client to talk to the Active Directory domain controller merely as a Kerberized LDAP server. As a result, we can have a single account and password for both Windows and Linux users across the domain. We don’t have to join the Linux machines to the domain, we just need to export the correct Kerberos credentials to each client. I learned about this technique primarily from this blog, although I also heavily referenced the Samba4 mailing list.

Note: If security is top priority for you, you may prefer the full domain join route for Linux machines. In fact, it’s probably best not to bother with Samba4 at all in that case…


You’ll need to make some minor changes to Samba4. Under /usr/local/samba/etc/smb.conf (or wherever this file is stored), add the following line:

idmap_ldb:use rfc2307 = Yes

This instructs Samba4 to rely on LDAP for uid/gid mapping for users. Otherwise, it will use the local winbind flat file database.

Admittedly, I haven’t found a lot of information on how this works, so I’m speaking mainly from what I’ve observed. Take it with a grain of salt. If you want to be able to resolve usernames/groups on your Samba4 server, you’ll need to perform the following steps on the server machine as well as any clients.


The key package you will need to make this work is nss-pam-ldap. You can find it here. As stated on the website, this package provides a PAM module and daemon (nslcd) for querying and authenticating to an LDAP server. Follow the instructions here on how to set it up. I don’t want to reiterate everything that’s stated in the documentation, so I’ll just add some things.

  • If you want to connect to a remote LDAP server, make sure you specify the address in the form ldap://IP_ADDRESS. I made the mistake of using the hostname and got strange errors…it took me a while to track that one down.
  • The /etc/nsswitch.conf portion tells your machine where to look for user/group/host name resolution. It’s an essential step, so don’t forget it.
  • When I installed the package, I’m pretty sure it edited the pam configuration files for me. Check to be sure. Also, be careful editing these files because you could easily lock yourself out of the machine.

Active Directory requires Kerberos authenticated queries. You might be able to specify a domain user and password, but a kerberos keytab is probably better. The keytab is a principle/encryption key pair that grants the right to request keys without a password. I’ll explain more about this in a bit. Before we make changes to /etc/nslcd.conf to integrate that component, we need to do some Kerberos related setup.


I recommend reading up on Kerberos; it’s a really neat technology that Active Directory utilizes heavily. Here are some terms you’ll find thrown around a lot:

Realm – The Kerberos Realm identifies a specific Kerberos database stored on a single machine (KDC). It’s usually the same as the domain name, but it doesn’t have to be.

Principle – *A Kerberos principal is a unique identity to which Kerberos can assign tickets. Principals can have an arbitrary number of components. Each component is separated by a component separator, generally `/’. The last component is the realm, separated from the rest of the principal by the realm separator, generally `@’. If there is no realm component in the principal, then it will be assumed that the principal is in the default realm for the context in which it is being used.

Keytab – *All Kerberos server machines need a keytab file, called /etc/krb5.keytab, to authenticate to the KDC. The keytab file is an encrypted, local, on-disk copy of the host’s key.

*Taken straight from the MIT documentation.

Before messing with these though, we need to create a domain user.

samba-tool user add ldap-service
samba-tool user setexpiry ldap-service --noexpiry

We then need to export a keytab for this user, which will allow us to request keys without a password:

samba-tool domain exportkeytab /etc/nslcd.keytab --principal=ldap-service

You’ll need to copy this key to any of your clients that use nslcd. Keep in mind that this key grants domain user privileges without using a password, so limit the permissions on this file heavily.

Before continuing, you should make sure you having these packages installed: kstart, libsasl2-modules-gssapi-mit. If you forget the latter, you’ll get a “no suitable SASL mechanism found” error.

Once again, open up nslcd.conf and add the following lines:

# samAccountName holds the username in AD. It's a suitable substitute for uid. We can map it to avoid duplicating data.
map passwd uid samAccountName

# homeDirectory is already in use by AD for storing the Windows directory. We'll store ours in unixHomeDirectory
map passwd homeDirectory unixHomeDirectory

# Enable Kerberos authentication
sasl_mech GSSAPI
krb5_ccname /var/run/nslcd/nslcd.tkt

The last line points at a ticket file that we haven’t done anything with yet. The keytab file just gives you the right to request tickets. A ticket actually gives you access to resources, and has an expiration date associated with it. We need to tell nslcd to request a tickets at a certain interval using this keytab file. To do this (in Ubuntu), edit /etc/default/nslcd to look like the following:


The nslcd service will use a daemon called k5start to request a new ticket at specified certain interval.

As an important sidenote, there is a service called nscd (not to be confused with nslcd) that caches name lookups. It’s good to have running on all your clients because it takes the load off your LDAP server. However, during testing you should turn it off via `service nscd stop` so that you’re actually talking to the LDAP server and not a local cache.

User Accounts

At this point, you should be able to start nslcd and run getent passwd without things hanging. Make sure to check your system log to verify that things are working. Keep in mind that you can run nslcd -d (with multiple d’s for more output) directly on the command line to see error output. You’ll want to check that LDAP binds are working. However, you won’t actually see any domain user accounts listed in the getent output because we haven’t added posixAccount classes to them yet.

The easiest way to do this is through the ldb* set of commands provided by samba. If you run kinit administrator, you should be able to connect to your Samba4 LDAP server with Kerberos authentication by using the “-k yes” flag. As an example:

 ldbsearch -H ldap://SERVER_HOSTNAME -k yes

should print the entire tree. I noticed that when using these utilities, I have to specify the hostname of my LDAP server. Using the IP address fails with NT_STATUS_INVALID_PARAMETER. I’m not sure why that’s the case.

In our case, we want to modify a user entry to include the posixAccount data. If you haven’t already, make a domain user using the samba-tool user add interface.

Next, create a .ldif file and add the following lines:

changetype: modify
add: objectClass
objectClass: posixAccount
add: uidNumber
uidNumber: UIDNUMBER
add: gidNumber
gidNumber: GIDNUMBER
add: unixHomeDirectory
unixHomeDirectory: HOMEDIRECTORY
add: loginShell
loginShell: /bin/bash

Obviously, you’ll need to replace the upper case entries with your specific data. Once you have this file, you can write it to the LDAP tree with the following command:

ldbmodify -H ldap://SERVER_HOSTNAME -k yes /tmp/YOURFILE.ldif

Now your user has all the data it needs to authenticate and login to a Linux client! If everything works, you should now be able to type getent passwd YOURUSER and see their entry in the output. You should also be able to su – YOURUSER and authenticate. If you’re lucky and it works, congrats! You’ve just logged in to your Linux client through Active Directory! It you’re like me, things probably didn’t work the first time. Make sure to check the logs and utilize debug mode for the nslcd and samba services. Good luck!


Joining Samba4 to an Existing Windows Domain.

Note: See my first post for more context.

I am developing an small Active Directory infrastructure for the Computer Science Department Linux/Windows lab environment at Taylor University. Lab users have a central user space shared by both the Linux and Windows profiles, which authenticate from to Active Directory. Our Windows profile setup utilizes folder redirection to the network share for most profile folders (Desktop, Music, Pictures, Downloads, AppData/Roaming, etc), while our Linux space is NFS mounted by clients. Since we favor Linux administration, we prefer to keep our user files stored on a Linux file system. The Samba4 server grants CIFS access to Windows users, while Linux users can automount via NFS. LDAP authentication is handled through Kerberos on both the Windows and Linux clients. While Windows clients are joined to the Domain, Linux treats Active Directory merely as a Kerberized LDAP server. Finally, each user account is extended with a posixAccount class allowing us to give each user the uid, gid, unix home directory, and login shell attributes they need to properly interface with nslcd (the ldap client daemon) for Linux logins.


I started by setting up two Windows Server 2008 R2 machines. I created a new domain, installed the Active Directory Domain Services and DNS roles via the dcpromo.exe utility, configured both machines as domain controllers. This process is very well documented by Microsoft so I won’t describe it here. It’s important to ensure that directory and DNS replication is happening between these machines. I chose two Windows domain controllers for a few reasons:

  1. Demoting any DC appears to fail when a Samba4 DC exists in the domain. I’m not sure if this is specific to my setup, but it’s just one of the many bugs I’ve found so far (I hope they fix it soon). As such, I don’t really trust S4 to keep my production directory data intact, and if I have to demote (read: manually delete) my Windows DC for any reason, I want the directory data housed within Windows DC.
  2. Related to the above comment, I’m not comfortable yet entrusting S4 with master roles, thus I need two Windows boxes in case I need to bring one down.
  3. Active Directory is highly reliant on DNS. As such, I strongly prefer the Windows DNS to a hacked up BIND solution or S4’s brand new internal system. Having a two replicating Windows DNS servers is even better.
  4. Apparently, the Samba4 DNS doesn’t play well with Windows DNS yet. Once again, if I have to choose…

Anyway, after configuring the Windows machines, the next step was to create an Ubuntu Linux server (I chose Ubuntu because it’s heavily favored by the Samba4 team) and join it as a DC to the existing Windows domain. You first need to prepare your server for Samba4. Then, you need to setup Samba to join an existing domain.

Rather than regurgitate what is already stated on that Wiki, I’ll just note some things.


When you setup the Kerberos configuration file as stated in the Wiki, it’s important to test it out via kinit administrator@YOURREALM. By authenticating, you’ve requested a ticket from the Kerberos server, which is then placed in the default ticket cache, located at /tmp/krb5cc_0. If this isn’t working properly, the provision step will fail–probably with an LDAP authentication error.

Domain Name Service

If you’ve followed the Wiki instructions correctly up to this point, the provision step should work. After doing lots of test provisions, I’ve noticed that the DNS replication doesn’t always work (this is actually a known documented issue I guess). You can run the test commands they give you in the Wiki, or use the Windows DNS GUI to check that entries exist–my preferred approach. The Active Directory DNS tree heavily utilizes SRV records for services. The main ones we care about are Kerberos, LDAP, and GC (Global Catalog). Browse the DNS tree and make sure that you can see your Samba4 server listed along with the other Windows DC’s. In my configuration I noticed that under DomainDNSZones and ForestDNSZones, my Samba4 server is not listed. I think this is intended behavior because the Windows DNS and Samba4 DNS don’t work together quite yet.
If you notice that those entries don’t exist, there’s a handy script called samba_dnsupdate, located in INSTALL_DIR/samba/sbin/. That should replicate the DNS entries you need to the Windows DC’s. In the cases where the provision step failed to do this for me, running that script fixed the problem. Your mileage may vary.

Directory Replication Service

You can view the replication status by running the following command: samba-tool drs showrepl. If things don’t look like they’re working, first make sure you’ve started Samba! It may take it a few minutes to “warm up.” You can force replication by running:

samba-tool drs replicate <destinationDC> <sourceDC> <NC>

The NC term is your LDAP naming context; to replicate the whole tree, use your base Distinguished Name, e.g. DC=SOMETHING,DC=TEST,DC=COM. Under the Active Directory Sites and Services subtree in the Windows Server Manager GUI, you can view the connections between servers (Sites -> Default-First-Site-Name -> Servers -> YourS4Server). I’ve done several test configurations, and every one of them throws an error when I try and replicate to the Samba4 server from a Windows DC. This is the error I get:

The following error occurred during the attempt to synchronize naming context DomainDNSZones.<DOMAINNAME> from Domain Controller <Samba DC> to Domain Controller <Windows DC>: The naming context is in the process of being removed or is not replicated from the specific server. The operation will not continue.

Interestingly enough, if I replicate via samba-tool using the same source and destination servers, everything works. I’m assuming there’s some incompatibility there that hasn’t been addressed yet by the Samba team. If anyone has a solution for this please let me know. So far it hasn’t caused any problems.
Another good test of replication is to create a user account and see if it transfers across domain controllers. You can do this via samba-tool user add.

Distributed File System Replication

Samba4 does not yet support DFS replication. For our purposes, the main issue this causes is group policy synchronization issues; namely, the SYSVOL folder–which houses group policy objects–is not replicated from the Windows domain controllers to the Samba server. As a result, you’ll need to manually mount the SYSVOL share, copy the data over, and then run samba-tool ntacl sysvolreset to fix the permissions. Ideally, you should write a script to do it. This only needs to be updated when you’ve changed a group policy object. This is a bit annoying, but it’s not a show stopper. If you don’t do this, then your clients will apply different group policy settings based on which controller they happen to talk to, which is bad.

Coming Up

At this point in the setup process, I had my two Windows and Samba4 domain controllers communicating pretty happily. In my next post I’ll talk about how I setup my Linux and Windows clients to authenticate, login, and access user data from the Samba server. Stay tuned!

Permutations and Combinations

In the last couple years I’ve regained interest in solidifying my math chops. One mathematical concept that I see popping up again and again is that of permutations and combinations. I tried searching around for quick explanations, but I wasn’t really pleased with what I found. I thought I would share an approach that really helped me get it.


Finding the number of permutations of a unique set is simple, but tricky to understand. Once it “clicks,” you’ll think to yourself, “why did I think that was so hard?” (if you’re like me anyways).

Given a set with unique elements, you can permute it by mixing up the elements. Finding the permutations of a set involves figuring out all the unique ways you can mix those elements. The key concept in a permutation is ordering: it matters how the elements are ordered. Thus, the permutation \left \{ab \right \} is distinct from \left \{ba \right \}. The question, “how many permutations are there of a set?”, is asking for all possible orderings of the elements in that set. Consider the set:

S = \left \{abc \right \}

What are all the permutations of S? One way to do this by hand is by picking an element to go first and enumerating the permutations of the remaining elements:

(a goes first)

\left \{abc \right \} \left \{acb \right \}

(b goes first)

\left \{bac \right \} \left \{bca \right \}

(c goes first)

\left \{cab \right \} \left \{cba \right \}

As you can see, there are 6 permutations of this set. One thing to notice is that there are 3 members in the set, and we could pick any of those to be the first element. More generally, given N elements in a set, there are N ways to pick the first element.

But what about the second element? Assuming all members are unique, we can take the subset of N-1 elements and pick any of those to be the second. For example, let’s assume we’ve picked a already, we’re left with a subset of S without a:

\left \{bc \right \}

This particular set is trivial, but let’s stick with our plan:

(b goes first)

\left \{bc\right \}

(c goes first)

\left \{cb \right \}

We end up with only two permutations of this subset. More generally, we’ve got N-1 elements, and we can pick any of those to go first in our subset.

You may be spotting a pattern here, and that’s the key insight. We started with N elements, and we’re free to pick any one of those to go first. Then, for every one of those N choices, we have a subset of N-1 elements. Following our pattern, we can pick any N-1 of those elements to go first in that subset. If we keep going, we end up with N-2, N-3, etc–all the way down to 1. When we reach a set of 1 element, we’re done. This is our base case.

The trickiest part is knowing how to combine these results.

Starting with N elements, we have N choices for the first one. For each of those N choices, we have N-1 sub-choices. These are multiplied together to give us the total number of choices, N * (N-1). But we aren’t done, because have to consider the N-2 choices for each of the N-1 sub-choices, for each of the N original choices, which is N * (N-1) * (N-2). If you continue this pattern all the way down to 1, you get the familiar factorial function, N! Going back to our example, we had 3 choices for our first element, which gave us the three lines above. For each of those 3 lines, we had a 2 element subset, of which the permutations are merely the 2 possible “flippings” of those elements. We ended up with 3*2*1 = 6 permutations, just as we expected.

Thus, the answer to our question, “how many permutations are there of a set?” is N!, where N is the number of elements in our set.


You may be familiar with the “n choose r” notation, which is denoted mathematically by the following equation:

\binom{n}{r} = \frac{n!}{r!(n-r)!}

This is a pretty scary looking equation, and it’s not easy to “reverse engineer.” the meaning by just looking at it. Fortunately, there’s an easy way to understand it that builds on the idea of permutations presented in the previous section, so bear with me.

First of all, what is a combination? Remember how permutations were ordered sets? The key behind a combination is that it is unordered. How many ways can you divide a group of 4 students into 2 person teams? That’s a combination problem. We don’t care who’s first or second on each team–only who is on each team. Thinking about it, combinations must somehow relate to permutations. More precisely, the combinations must be a subset of the permutations. Why? Well, if you have any combination of students on a team, you could generate permutations from that–by picking someone to go first, etc., like we learned above. It “feels” like there are duplicate sets being counted that we don’t care about–which is exactly the case. The equation above is basically computing all possible permutations and then “trimming” the duplicate results to give us the combinations.

With that insight, let’s examine that scary equation once again, just for a bit.

\binom{n}{r} = \frac{n!}{r!(n-r)!}

This equation is saying, “how many ways can we choose r elements from n elements?” Using our team example above, we’d be choosing 2 person teams from 4 total people. Let’s plug it in and see what we get:

\binom{4}{2} = \frac{4!}{2!(4-2)!} = \frac{24}{2*2} = 6

Given a set, \left \{abcd \right \}, it’s not too hard to pick out those 6 combinations:

\left \{ab\right \} \left \{ac \right \} \left \{ad \right \} \left \{bc \right \} \left \{bd \right \} \left \{cd \right \}

As you can see, each combination includes a unique group of people, in no particular order. We could visit each of these groups, permute them by picking all possible orderings, and combine the results to get 24 total permutations. I’ll leave that as an exercise for you. The key insight is that we can compute permutations from combinations. Each permutation is the same combination, just mixed up differently. Let’s just examine one of these combinations: \left \{ab\right \}.

Finding the permutations in this set is trivial. Here they are:

\left \{ab \right \} \left \{ba \right \}

There are 2! = 2 permutations here. These are the permutations of the combination, \left \{ab\right \}. Hmmm, it seems like we’re ending up with twice as many results as we want. For each combination, we have a duplicate. We need to take our original n! permutation estimate and somehow remove these duplicates. We do this by dividing by r!. Think about it, if we have a 2 element set, we end up double counting everything. We need to divide by 2. If we had a 3 element set, we could have even more duplicates. We would need to divide by 3! = 6.

Using our previous example, dividing 24 (the number of permutations of n elements) by 2 (the number of permutations of r elements) gives us 12–not 6. What gives?

We’ve still got the (n – r)! term in there. If you think about it, n-r counts the remaining elements not chosen in our combination. Let’s take a look at our team example once more to see why we need this.

Let’s say we’ve chosen the combination, \left \{ab \right \}. We know that this expands into 2 permutations, so we divide our total number of permutations (24 in this example) by 2, but we still have duplicates. The issue is that our n! permutations still include permutations of the remaining elements, which are incorrectly included in our count. Let’s examine our first combination once again:

\left \{ab \right \}

We know we can permute this, but we still have to combine it with the remaining unused elements, which combined form a permutation of the original set. Thus, even if we purge duplicates from our subset, we still end up with more duplicates. Assuming we’ve purged duplicates from our example subset, our combined set still includes the permuted unused elements:

\left \{abcd \right \} \left \{abdc \right \}

To correct this, we need to divide by the (n-r) element permutations, which gives us the missing (n-r)! term. If we divide our original n! permutations by these two terms, we finally get the correct result:

\binom{n}{r} = \frac{n!}{r!(n-r)!}

Samba 4: A Case Study (Part 1)

I work in the IT department for the Computer Science Department at Taylor University. We do things a little differently here in terms of our lab machine setup. We dual boot Ubuntu Linux and Windows 7, but we centralize our file serving and domain controlling through Samba on Linux–as opposed to a Windows Active Directory Solution. Currently, we use Samba 3.5.x as our Primary Domain Controller with an OpenLDAP backend to host user accounts. This has been working, but it has issues. Samba 3 was designed for the NT style domain, which is really starting to show its age when compared to modern Active Directory. For instance, to edit a group policy setting, we currently have to manually edit our Windows OS image and push it out to every machine (alternatively, we have an updater script that allows us to do it without re-imaging, but it’s still a pain).

I started following Samba 4 development late last year, which just had its first official release a couple months ago, and I’m now in the process of building a production active directory cluster to replace the Samba 3 server. The official HOWTO is a little skimpy (listed here:, so I thought  I would add some of my experience setting this stuff up. As a disclaimer, I’m no expert, so this just my understanding based on messing around with it. If you notice something that’s not correct, please leave a comment or email me.

System Architecture

Before I go into details, I wanted to give a bit of an overview of the server architecture I went with. Although the Samba team would love for you to use Samba 4 exclusively, my personal experience tells me that it really should have a Windows DC attached to it (two would be better). I guess it depends on your specific needs, but one big reason we’re opting for Samba is the unification of file space and permissions between Linux and Windows.

One key thing to know about Active Directory is that it’s deeply tied into DNS. You can’t query for anything in the domain without a working DNS configuration. For a Linux-only solution, there are two options: use a heavily modified Bind configuration integrated with Samba (which has issues), or just choose the new default internal DNS server built by the Samba team.

Messing with Bind isn’t my cup of tea, and if you’re looking for help there, I’m not your guy. I’m not really excited about using the custom DNS solution either, for a couple reasons. For one, it’s brand new, so it’s a security risk, and it lacks the stability of industry standard DNS solutions. It seems like the Samba team just got fed up hacking Bind to match their needs, so they just wrote their own. That’s all well and good, but I’d rather use the Windows DNS solution. It’s turnkey, stable, and still allows for Linux integration I need.

I fiddles around with a couple different server configurations. I first tried using S4 as the initial DC. I built a Windows Server 2008 R2 machine (sidenote: don’t bother with Windows Server 2012, it’s not supported yet), and tried joining it to the Samba domain. I kept running into strange errors in dcpromo.exe, so that didn’t work.

In my second attempt, I built a Windows server box and joined the S4 server to it, using the instructions listed here:

This worked pretty well. I was able to join successfully and get replication working; I really like having the Windows GUI for DNS and Group Policy settings accessible from Windows. Because I don’t quite trust S4 yet, I added  a second Windows DC to the cluster. I’ve noticed that the S4 server screws up some features. For instance, I can’t demote any of the domain controllers (I get weird errors), so I have to manually remove the connections and DNS entries. This isn’t the end of the world for us because we have a small cluster. For larger clusters this would be a big show stopper. I’ve seen some chatter about this on the samba news list, but no solutions.  There are other minor issues I’ve run into, but I won’t list them here. I’ll probably make a separate post for that.

Those hiccups aside, I’m able to replicate the directory entries between all domain controllers. I can also create accounts on either the S4 or Windows machines successfully. I created a couple fresh Windows 7 builds and configured some group policy settings for them (including folder redirection and roaming profiles). It didn’t take too long to get that working.

There’s a lot of details I’ve come across in setting this up, so I’m going to try and spread it out over a few posts. Stay tuned for updates! If you have any questions about things, feel free to comment or email me.

Gaussian Blur Effect

Note: This is taken from the MetaVoxel development blog,

I started a task last week to add a simple Gaussian blur effect to the background blocks in the scene.  Before we were just drawing them with no filtering, so they looked “pixelated.” I tried using bilinear filtering, but that looks pretty ugly too. It ended up taking way too much time, but I learned a lot about image compositing and running convolution kernels on the GPU. Here are the results:

blur blur2

The algorithm splits the scene up into layers, similar to how you might do it in Photoshop. The “blur” layer is saved to a separate render target and filtered using a two-pass Gaussian blur. This makes for a nice softening effect. The layers are then composited together in a final merge step and rendered to the screen.

For those who care about the details, I ran into some issues with compositing, and it turned out that I’d been thinking about alpha blending all wrong. Most resources I’ve seen out there use linear interpolation for blending, based on the source alpha. The equation looks like:

dest = src_alpha * src_color+ (1 – src_alpha) * dst_color

In OpenGL, this is done through glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA ).

There are some issues with this blend mode. For one, how do you represent complete transparency? Is it (0, 0, 0, 0)? Or (255, 255, 255, 0)? You can pick any arbitrary “color” for your transparent surface, which doesn’t really make physical sense. Even worse, if you render a texture into a transparent render target, you end up with discolored fringes:


The solution is to use premultiplied alpha. We instead use the blend mode:

dest = src_color + (1 – src_alpha) * dst_color

This way, (0, 0, 0, 0) is the only true transparent color. It reacts properly to filtering and avoids the ugly fringes. Once I switched the alpha blending to use premultiplied alpha, I was able to set the layer render targets to fully transparent, render into them, and then blend the layers together correctly.

This requires a bit of overhead for texture loading, because the input color is multiplied by the alpha channel. The results look very good though.

Buffer Streaming in OpenGL

I spent a bit of time recently designing a sprite batching system using the (more) modern OpenGL 3.x core functionality rather than the old immediate mode stuff. I thought I would share some of my observations about working with it. My first approach to the problem consisted of allocating a single, large dynamic vertex buffer object, which I mapped via glMapBuffer() to copy in data for each batch of sprites. I used a static index buffer that I locked once and filled at initialization time, since I only ever needed to draw quads. This worked fine, except it was wicked slow.

For one, MetaVoxel uses alpha blending very heavily, requiring everything to be drawn back to front. As you might expect, this results in an output sensitive batch size, because it relies on the next quad requiring an identical rendering state. Sorting by texture isn’t possible, unfortunately. Additionally, each time I mapped the buffer, I would map the entire thing all at once–even for small batches. Anyway, it didn’t take much to bring things to a crawl. Profiling revealed a vast majority of time spent in the driver, waiting on glMapBuffer.

Clearly, I was doing something wrong. The key issue turned out to be buffer synchronization between the CPU and GPU. I found some great resources below that go into detail explaining how to optimize this.

I won’t rehash the details of how buffer orphaning works, see the above articles if you’re interested. The basic idea is that the driver is very conservative and will happily stall waiting for a buffer to flush to the GPU before letting you write to it again. You either need to coax the driver to allocate a new buffer for you, or implement a ring buffer scheme yourself. I ended up implementing the following techniques that got things running smoothly.

    I used glMapBufferRange to map only the required amount of data per batch, specifying the GL_MAP_INVALIDATE_RANGE_BIT.

  • I created a ring buffer of vertex buffer objects, mapping each one in sequence with each new batch. Rather than coaxing the driver to give me a fresh set of data each time, I implemented it myself.
  • I specified GL_STREAM_DRAW as the driver hint.

When I tested this on several machines, I saw vastly better performance–indicating less synchronization between the CPU and GPU.

MetaVoxel: The Technology

We are building MetaVoxel in C++ with SDL/OpenGL, all of which are free, cross platform technologies. Since October, we’ve accumulated over 18,000 lines of code (and counting!), and built some pretty neat tools to aid us in developing this game. Our primary tool is the Voxel editor, which lets us cycle between prefab elements and place them in the world. From there, we can push and pop into metas (a silly term we’re using to describe a mini-world element) and edit them too.

For example, here’s a map I threw together last night:



Mind you, there’s lots of visual elements enabled here, so it probably just looks cluttered, but Stett and I are finding that we can work pretty quickly to crank things out. In the lower left, you can see a stream of console output. We decided to rely heavily on console commands to relieve the burden of crafting a fancy user interface. It’s easy to drop in a new command if we need more control. We also have a limited UI library we’ve written for other parts of the engine that really benefit from it (see below).

We’ve distinguished “entities” from static parts of the map. Those are highlighted in green in the editor. Pretty much any game object that can move (with some rare exceptions) and interact with the player is considered an entity.

Building the static map is fun and all, but this is a game about puzzles! To craft the object interactions, we’ve built a link system within our entity logic. Essentially, it’s a distributed event system that allows entities to communicate with each other when certain important things happen. For instance, when the player picks up a key and presents it to a lock, the engine needs a way to tell the lock, “hey! you’re being unlocked.” That way, the lock can do stuff like play its unlock animation. Furthermore, the events can be linked to fire conditionally when something else happens. I tested this by linking a bunch of locks together so they would unlock in sequence.

Originally, we just used console commands to link entities together. Not only was this tedious, but it didn’t really give us enough information. It was hard to see which entities were linked together, and you couldn’t really tell which entity was linked to whom. So one weekend, I worked way too much and built a link editor to make this easier:





The frame on the left shows the “meta-tree” while the right side shows the current workspace for linking. You can link entities from arbitrary metas using this shared workspace.

We thought it would be really cool to build an animation system for characters. Rather than just cycle through frames of a sprite, our system lets you keyframe 3D positions of a character, along with the rotation/scale in 2D. Crafting those animation files by hand would be a pain (and waste of time), so we built a simple animation editor:



Yeah, it’s not very pretty. The weird gray background is Stett’s fault, he needed a middle gray to check for contrast issues in sprites. Utility trumps beauty I suppose. Also, a simple issue with our lighting system renders Jacob too dark; we haven’t fixed it quite yet. At any rate, it does what we need! We’ll release a video soon to show off the animations. I think they’re pretty cool. The death animations will be the best.

Naturally, there’s still lots to do. We’re really trying to double down and focus on developing fun puzzles. The extra stuff is nice, but the game needs to be fun right? Yeah, we think so too.

Thanks for reading! Here’s some more screenshots for kicks:

neat tower2