Implementing a .vmf parser

April 14, 2018

In my continued quest into Golang, and war on the Source Engine, I found myself led towards the raw map format from which .bsps are compiled. This format (.vmf, the Valve Map Format) resembles JSON quite heavily, but not closely enough for existing JSON parsers to handle it. It’s also very strict in its layout, so being able to parse it into a predefined structure as much as possible could be a real benefit.

So what do we do? Write our own of course!

What does a vmf look like?

At its core, the vmf format is a set of explicit root Nodes (or a single root Node, if you consider the vmf itself to be the root key). Every Node has a key and a value. The key is always textual. The value is either text, which may encode any of a whole host of types such as a Vec3 or a float, or a nested block of further Nodes.

Here is a small extract from a vmf I created a while ago:

versioninfo
{
	"editorversion" "400"
	"editorbuild" "6920"
	"mapversion" "919"
	"formatversion" "100"
}
visgroups
{
	visgroup
	{
		"name" "spires"
		"visgroupid" "20"
		"color" "149 106 131"
	}
}
viewsettings
{
	"bSnapToGrid" "1"
	"nGridSpacing" "128"
	"bShow3DGrid" "0"
}
world
{
	"id" "1"
	"mapversion" "919"
	"classname" "worldspawn"
	"detailmaterial" "detail/detailsprites"
	"detailvbsp" "detail.vbsp"
	"skyname" "sky_borealis01"
	"spawnflags" "0"
	solid
	{
		"id" "3"
		side
		{
			"id" "7"
			"plane" "(-1728 768 32) (-1728 768 -32) (-1728 960 -32)"
			"material" "TOOLS/TOOLSNODRAW"
			"uaxis" "[0 1 0 0] 0.25"
			"vaxis" "[0 0 -1 0] 0.25"
		}
	}
}

It’s worth noting that we know a lot about the structure from the Valve page: https://developer.valvesoftware.com/wiki/ValveMapFormat. The page also provides a comprehensive list of the possible top-level nodes, which makes it simple to define a struct that represents a parsed file.

A solution

We can end up with something like this:

type Vmf struct {
	VersionInfo Node
	ViewSettings Node
	VisGroup Node
	World Node
	Entities Node
	Cameras Node
	Cordon Node // Pre-L4D only
	Cordons Node // Post-L4D only
	Unclassified Node
}

Note a few things. Firstly, this isn’t versioned: Cordon and Cordons are mutually exclusive depending on engine version. This could be solved by providing a different structure per engine version, making the members private, and exposing them via methods that implement a VMF interface.
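As a rough sketch of that versioning idea (not what the library currently does), the two layouts could sit behind one interface. Everything here is hypothetical: the VMF interface, the preL4D and postL4D types, and the assumption that a post-L4D cordons block nests individual cordon blocks. Node is the same type described further down.

```go
package main

import "fmt"

// Node matches the Node type described later in the post.
type Node struct {
	key   string
	value interface{}
}

// VMF is a hypothetical interface hiding engine-version differences.
type VMF interface {
	World() Node
	Cordons() []Node
}

// preL4D holds a single top-level cordon block.
type preL4D struct {
	world  Node
	cordon Node
}

func (v preL4D) World() Node     { return v.world }
func (v preL4D) Cordons() []Node { return []Node{v.cordon} }

// postL4D holds a cordons block, assumed to nest cordon blocks.
type postL4D struct {
	world   Node
	cordons Node
}

func (v postL4D) World() Node { return v.world }

// Cordons returns only the nested cordon blocks, skipping plain
// properties that live alongside them.
func (v postL4D) Cordons() []Node {
	children, _ := v.cordons.value.([]Node)
	var out []Node
	for _, c := range children {
		if c.key == "cordon" {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	var m VMF = postL4D{cordons: Node{key: "cordons", value: []Node{
		{key: "active", value: "0"},
		{key: "cordon", value: []Node{{key: "name", value: "main"}}},
	}}}
	fmt.Println(len(m.Cordons()))
}
```

Callers then only ever ask for Cordons() and never need to know which engine version produced the file.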

Secondly, we have an Unclassified property that is not described by the documentation. There are two reasons for this. The first is that there are rare cases where an author may want to include some custom information in the vmf. So long as the contents are valid, vbsp will just ignore them when compiling, but we still want to capture this information if it exists. The other reason relates to the bsp entdata lump. Entdata is a lump in the bsp that describes all entities in the map. It just so happens (for obvious reasons, really) that the entdata block conforms to the exact same spec as VMF, and so can be parsed in the same way, with the single exception that the parser, quite reasonably, cannot determine which root Node the data belongs to. Unclassified can be reused in this case. This is great, because when looking at entdata in the future we have a library ready to go, although a wrapper may be useful to abstract away the concept of a Vmf.

Thirdly, each property is a Node. As we are modelling the Vmf as a tree, a Node is a simple object that looks like:

type Node struct {
    key string
    value interface{}
}

This is not perfect, but value is either a string or a []Node, and working out which one it holds is trivial. So we create a simple Node tree that we can traverse to find whatever we want.

Overall usage of the solution can be seen over on the project page, but here is a simple example:

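package main

import (
	"log"
	"os"

	"github.com/galaco/vmf"
)

func main() {
	file, err := os.Open("de_dust2.vmf")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	reader := vmf.NewReader(file)
	f, err := reader.Read()
	if err != nil {
		log.Fatal(err)
	}

	// Print the classname of the first entity in the map.
	log.Println(f.Entities.GetChildrenByKey("entity")[0].GetProperty("classname"))
}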

Gotchas

Vmf is actually a fairly simple format when broken down. However, there are rules that are valid but not defined by the existing documentation. The most notable is that Node values can span multiple lines. For example:

{
   "classname" "game_text"
   "targetname" "why_is_it_always_this_entity_that_causes_problems"
   "message" "this is some text that contains

some newlines"
}

The above is perfectly valid in pre-CS:GO Hammer versions, and valid in all engine versions. As such, parsing line-by-line becomes a little tougher, but not unbearably so.

Anyway, you can find the library/code here: https://github.com/galaco/vmf

Thanks for reading.