The Name
Drake is a type of dragon. Dragons are gold keepers, and this system is also golden. :) With this system, I started a naming convention related to mythological creatures/entities.
Background
We faced significant performance issues with MeshRenderers. Internal managers for MeshRenderers were (and may still be) very CPU-intensive, as was frustum culling. However, we needed to use many renderers.
At the time, Entities were nearing version 1.0, and my tests with heavy renderer loads showed much better performance. The idea was straightforward: use Entities for rendering. The requirements were:
- Minimal changes to artists' workflows
- Only rendering changes (physics and other systems remain unchanged)
- Interactivity with Unity's OOP world (most renderers are static, but hundreds require some interaction):
  - Follow transforms
  - Copy enable/disable state from the corresponding GameObject
  - Process material change requests (a recent addition)
- Streaming via Addressables, as Entities' content management would duplicate assets, and we were unaware of Addressables' scaling limitations
Subscene systems were not feasible for this use case.
First steps
Let's start simple: take a MeshRenderer and convert it to an entity at runtime. It's easy, just use RenderMeshUtility.AddComponents. Nice win.
The next step was to remove MeshRenderer and move all required data to DrakeMeshRenderer:
- Mesh
- Materials
- RenderMeshDescription
Editor code was implemented to convert MeshRenderer to DrakeMeshRenderer. The DrakeMeshRenderer itself is just a few lines of code:
var count = _materials.Count;
var renderMeshArray = new RenderMeshArray(_materials, new Mesh[] { _mesh }); // Allocation :(
var localToWorld = new LocalToWorld { Value = transform.localToWorldMatrix };
_entities = new NativeArray<Entity>(count, Allocator.Persistent, NativeArrayOptions.UninitializedMemory);
// One entity per material/submesh pair
for (var i = 0; i < count; ++i)
{
    var entity = entityManager.CreateEntity();
    RenderMeshUtility.AddComponents( // Heavy :(
        entity,
        entityManager,
        desc, // the stored RenderMeshDescription
        renderMeshArray,
        MaterialMeshInfo.FromRenderMeshArrayIndices(i, 0, i) // material i, mesh 0, submesh i
    );
    entityManager.AddComponentData(entity, localToWorld);
    _entities[i] = entity;
}
LODs
Next, we needed to handle LODGroup conversion by creating an entity with the LODGroup data and notifying rendering Entities that they are linked to it.
Entities LodGroup
We need MeshLODGroupComponent. It's easy code:
public void Initialize(LODGroup lodGroup)
{
    var transform = lodGroup.transform;
    var worldSize = LodUtils.GetWorldSpaceScale(transform) * lodGroup.size;
    localReferencePoint = lodGroup.localReferencePoint; // Local space
    // Unused LOD slots stay at infinity
    lodDistances0 = new float4(float.PositiveInfinity);
    lodDistances1 = new float4(float.PositiveInfinity);
    var lods = lodGroup.GetLODs();
    for (var i = 0; i < lods.Length; ++i)
    {
        var d = worldSize / lods[i].screenRelativeTransitionHeight; // But here world space ugh
        if (i < 4)
        {
            lodDistances0[i] = d;
        }
        else
        {
            lodDistances1[i - 4] = d;
        }
    }
}
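To make the arithmetic concrete, here is a minimal standalone sketch of the same distance computation, using a plain float array instead of two float4 values. The LodDistanceSketch class and its method name are made up for illustration, and worldSize is assumed to be the already scaled LODGroup size from the snippet above.

```csharp
using System;

public static class LodDistanceSketch
{
    // Mirrors the loop above: one switch distance per LOD, unused slots stay at +infinity.
    // worldSize: LODGroup size multiplied by the world-space scale (assumed precomputed).
    // transitionHeights: screenRelativeTransitionHeight of each LOD.
    public static float[] ComputeDistances(float worldSize, float[] transitionHeights)
    {
        if (transitionHeights.Length > 8)
            throw new ArgumentException("At most 8 LODs fit into two float4 values.");

        var distances = new float[8];
        Array.Fill(distances, float.PositiveInfinity);
        for (var i = 0; i < transitionHeights.Length; ++i)
        {
            distances[i] = worldSize / transitionHeights[i];
        }
        return distances;
    }
}
```

For example, a world size of 2 with transition heights 0.5, 0.25 and 0.125 gives distances 4, 8 and 16, and the remaining five slots stay at infinity.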
DrakeRendererManager and modifications of DrakeMeshRenderer
Now DrakeMeshRenderer needs to add MeshLODComponent to its entities. So we need to save the LOD mask at conversion time, and we need the entity spawned by DrakeLodGroup.
Let me introduce DrakeRendererManager, which creates both renderer and LOD group Entities, so after the LOD group entity is created we can pass it to the spawning of renderers.
For simplicity, we maintain a link between DrakeLodGroup and DrakeMeshRenderer:
- DrakeMeshRenderer has an optional reference to DrakeLodGroup.
- DrakeLodGroup has an array of DrakeMeshRenderer references, requiring at least one entry (otherwise, it’s an invalid LODGroup).
DrakeRendererManager has two Register methods: one for DrakeLodGroup (DLG) and one for DrakeMeshRenderer (DMR), the latter also called from DLG for each child. Both DLG and DMR call the appropriate method from Start.
Streaming
Streaming is critical to load only necessary meshes and materials. RenderMeshArray and MaterialMeshInfo must be added and removed dynamically.
Addressables uses strings as keys, which cannot be bursted, so we store them indirectly.
Ladies and gentlemen, DrakeMaterialMeshInfo:
public struct DrakeMaterialMeshInfo : IComponentData
{
    public ushort meshIndex;
    public ushort materialIndex;
    public byte submesh;
}
Marvelous one. This component stores material and mesh indices, which require arrays to index into. This led to the creation of DrakeAddressablesManager.
If C# had header files, the one for DrakeAddressablesManager would look like this:
public class DrakeAddressablesManager
{
    Dictionary<string, ushort> _meshKeyToIndex;
    Dictionary<string, ushort> _materialKeyToIndex;
    List<AddressableLoadingData<Mesh>> _meshLoadingData;
    List<AddressableLoadingData<Material>> _materialLoadingData;

public:
    ushort RegisterMaterial(string materialKey);
    void StartLoadingMaterial(ushort materialIndex);
    void StartUnloadingMaterial(ushort materialIndex);

    ushort RegisterMesh(string meshKey);
    void StartLoadingMesh(ushort meshIndex);
    void StartUnloadingMesh(ushort meshIndex);

    Optional<BatchMaterialID> TryGetLoadedMaterial(ushort materialIndex);
    Optional<BatchMeshID> TryGetMesh(ushort meshIndex);
}
And if you wonder about AddressableLoadingData<T>:
public struct AddressableLoadingData<T> where T : Object
{
    public readonly string key;
    public AsyncOperationHandle<T> loadingHandle;
    public ushort counter;
}
During registration, we register keys and add DrakeMaterialMeshInfo instead of RenderMeshArray and MaterialMeshInfo. Then another simple system checks whether the renderer should be loaded (based on a distance-to-camera vs. LOD distance comparison). If so, StartLoadingMaterial and StartLoadingMesh are called with the corresponding indices.
The next frame, another system checks TryGetLoadedMaterial and TryGetMesh. If both succeed, MaterialMeshInfo is created and added to the entity.
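The registration and counting part of this flow can be sketched without any Unity types. The class below is a hypothetical, simplified stand-in for that half of DrakeAddressablesManager: it only maps string keys to stable ushort indices and counts users, and it assumes that the counter field from AddressableLoadingData<T> works as a plain reference count. The actual Addressables load and release calls are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the key/index bookkeeping described above.
public sealed class KeyRegistrySketch
{
    readonly Dictionary<string, ushort> _keyToIndex = new Dictionary<string, ushort>();
    readonly List<string> _keys = new List<string>();
    readonly List<ushort> _counters = new List<ushort>();

    // Returns a stable index for the key; the same key always maps to the same index.
    public ushort Register(string key)
    {
        if (_keyToIndex.TryGetValue(key, out var index))
            return index;
        if (_keys.Count > ushort.MaxValue)
            throw new InvalidOperationException("Too many keys for a ushort index.");
        index = (ushort)_keys.Count;
        _keyToIndex.Add(key, index);
        _keys.Add(key);
        _counters.Add(0);
        return index;
    }

    // Returns true for the first user, which is when the real load would start.
    public bool StartLoading(ushort index)
    {
        _counters[index]++;
        return _counters[index] == 1;
    }

    // Returns true when the last user is gone, which is when the asset could be released.
    public bool StartUnloading(ushort index)
    {
        if (_counters[index] == 0)
            throw new InvalidOperationException("Unload without a matching load.");
        _counters[index]--;
        return _counters[index] == 0;
    }

    // The string key is only needed on the managed side, e.g. to start the load.
    public string KeyOf(ushort index) => _keys[index];
}
```

The point of the indirection is that components only carry the ushort index, which is blittable, while the string key stays in managed code.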
You may ask: what about RenderMeshArray?
It’s very wasteful, as we know when resources are loaded. We manually replicate RenderMeshArray’s functionality by obtaining EntitiesGraphicsSystem and calling (Un)RegisterMesh or (Un)RegisterMaterial. This eliminates multiple managed array creations - double win.
Mipmaps streaming
It's not the topic of this post, but it deserves a quick mention.
Once mesh loading is complete, we collect UV distribution data. When material loading is finished, we register the material to the mipmap streaming system, which provides a MaterialMipMapsStreamingHandle. After both the mesh and material are loaded and the Entity is being spawned, UVDistributionComponent and MipmapsStreamingComponent are added to the Entity. From then on, mipmap streaming systems can process these and update the required mip level.
Let's move it
Another requirement: the ability to link a transform with a renderer entity.
Transforms are one of Unity’s Four Horsemen of the Apocalypse. To avoid killing performance, we use TransformAccessArray.
We model this as: after the simulation step, run IJobParallelForTransform to copy transform data to the linked Entities.
The challenge lies in the near-nonexistent documentation for TransformAccessArray. Let me reiterate: one of the most impactful and essential optimization tools lacks explanation - good job Unity as always :)
It's worth mentioning how we store transforms in an ECS unmanaged component:
public struct LinkedTransformComponent : IComponentData, IEquatable<LinkedTransformComponent>
{
    public readonly UnityObjectRef<Transform> transform;

    public LinkedTransformComponent(Transform transform) {
        this.transform = transform;
    }

    public bool Equals(LinkedTransformComponent other) {
        return transform.Equals(other.transform);
    }

    public override int GetHashCode() {
        return transform.GetHashCode();
    }
}
The registration method collects new entities and registers each:
int Register(Entity entity, Transform transform)
{
    // Reuse a previously freed slot if there is one
    if (_freeIds.Length > 0) {
        var lastIndex = _freeIds.Length - 1;
        var index = _freeIds[lastIndex];
        _freeIds.RemoveAtSwapBack(lastIndex);
        _transformsArray[index] = transform;
        _linkedTransformEntities[index] = entity;
        return index;
    }
    // Otherwise append at the end
    var newId = _transformsArray.length;
    _transformsArray.Add(transform);
    _linkedTransformEntities.Add(entity);
    return newId;
}
_freeIds tracks unoccupied transform indices in _transformsArray. At the time, leaving removed transforms in TransformAccessArray was faster than removing them and adjusting Entity indices. We also use LinkedTransformIndexComponent, an ICleanupComponentData storing the index to free.
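The slot reuse idea is independent of Unity, so here is a minimal sketch of it with managed lists standing in for TransformAccessArray and the native lists. SlotTableSketch and its Unregister method are hypothetical names; in the real system the freeing is driven by the cleanup component mentioned above.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in: List<T> replaces TransformAccessArray and the native lists.
public sealed class SlotTableSketch<T> where T : class
{
    readonly List<T> _slots = new List<T>();
    readonly List<int> _freeIds = new List<int>();

    // Number of slots, including holes left by unregistered items.
    public int Count => _slots.Count;

    public int Register(T item)
    {
        if (_freeIds.Count > 0)
        {
            // Reuse the most recently freed slot.
            var last = _freeIds.Count - 1;
            var index = _freeIds[last];
            _freeIds.RemoveAt(last);
            _slots[index] = item;
            return index;
        }
        _slots.Add(item);
        return _slots.Count - 1;
    }

    // Leaves a hole instead of compacting, so every other index stays valid.
    public void Unregister(int index)
    {
        _slots[index] = null;
        _freeIds.Add(index);
    }
}
```

The trade-off is that the table never shrinks, but no stored index ever has to be patched after a removal.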
Two-way link
We can read managed data (track transform values), but we also need to manipulate Entities from the managed side.
If requested, a LinkedEntitiesAccess MonoBehaviour is added during registration, holding a list of Entities linked to the GameObject. A challenge arises because registration uses EntityCommandBuffer, so the Entity isn’t immediately available. As a workaround, registration creates a LinkedEntitiesAccessRequest with a UnityRef to the LinkedEntitiesAccess MonoBehaviour. After ECB playback, a system processes requests and registers valid Entities to the given LinkedEntitiesAccess. This requires a managed, unbursted system, but I’m unsure if there’s a better approach.
Enable/Disable/Destroy
With access to entities, we can react to GameObject state changes: hide entities when the GameObject is disabled, show them when it is enabled, and destroy renderers along with the GameObject.
Materials
Often there is a need to modify the material of a renderer (strictly for that renderer, not the material asset itself). There are two main modifications:
Replace value
This uses custom components tagged with MaterialPropertyAttribute from Entities Graphics. No custom modifications are required, only a smart editor to display possible changes.
Replace whole material
To replace a material, we first register the new material with DrakeResourcesManager, which returns a material index for creating a DrakeMaterialMeshInfo.
The next step is tricky, as it depends on the current state of DrakeMeshRenderer. There are five states to address, but all involve replacing the old DrakeMaterialMeshInfo with a new one. Additional operations primarily include unloading the previously loaded material (decrementing its refcount) and adding or removing certain state components.
Scene unloaded
Scene lifetime management is required.
Startup is easy using MonoBehaviour’s Start (preferred, as spawning agents can edit properties, e.g., mark as non-static).
Unloading is much trickier, as no MonoBehaviour remains (see the Optimize/Leftovers section). To solve that issue, I created the SystemRelatedLifeTime<T> class with a nested IdComponent. IdComponent has just an int id and is an ISharedComponentData.
To make such a generic component work, you need to register every usage with a line like:
[assembly: RegisterGenericComponentType(typeof(SystemRelatedLifeTime<DrakeRendererManager>.IdComponent))]
Now you can make a query over that component, with a filter for a specific IdComponent, and destroy all matching entities. It’s highly efficient and elegant.
The id for a scene is just the scene handle value.
Vertex snap hack
Our artists often use the vertex snap tool in the editor, so Drake needs to support it. Unfortunately, there is no such functionality in Entities. As a workaround, we made a special mode in which, instead of creating entities, we spawn old LodGroups and MeshRenderers as children hidden in the hierarchy. That made the editor flow very complicated (the need to keep track of all Drakes, spawning in two ways, enter/exit play mode edge cases, a prefab editing nightmare, duplication edge cases, undo and so on).
There is not much more to say, just that Unity editor support is very painful.
Optimize
Leftovers
Once Drakes are registered, there is no need to keep their corresponding MonoBehaviours. In most cases, these are the only components on a GameObject, allowing the GameObject to be destroyed. Often, this leaves the parent GameObject as a leaf node with no components, which can also be purged, and the cycle continues up to the first GameObject that is still needed.
This results in shallower hierarchies and allows many small memory allocations to be reclaimed faster (after a GarbageCollector run, of course).
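The upward purge can be sketched on a toy hierarchy. NodeSketch below is a hypothetical minimal stand-in for a GameObject (a parent, children and a component count), not Unity's API, and the rule that the root is never removed is an assumption made for the sketch.

```csharp
using System.Collections.Generic;

// Hypothetical minimal stand-in for a GameObject.
public sealed class NodeSketch
{
    public NodeSketch Parent;
    public readonly List<NodeSketch> Children = new List<NodeSketch>();
    public int ComponentCount;

    public NodeSketch AddChild(int componentCount)
    {
        var child = new NodeSketch { Parent = this, ComponentCount = componentCount };
        Children.Add(child);
        return child;
    }
}

public static class LeftoverPruningSketch
{
    // Removes the node if it is an empty leaf, then repeats the check for its parent.
    // Returns how many nodes were removed. A node without a parent is never removed.
    public static int PruneUpwards(NodeSketch node)
    {
        var removed = 0;
        while (node != null && node.Parent != null
               && node.Children.Count == 0 && node.ComponentCount == 0)
        {
            var parent = node.Parent;
            parent.Children.Remove(node);
            node.Parent = null;
            removed++;
            node = parent;
        }
        return removed;
    }
}
```

Calling PruneUpwards on a node whose last component was just removed walks up until it meets a node that still has children or components, which matches the "up to the first needed GameObject" rule.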
Static
If you check the Entities Graphics code, you will find that the LodGroup entity exists only to update components on rendering entities.
For static groups with no moving parts, we can skip spawning the LODGroup entity and assign the values at spawn time.
The plan is simple: for a static group, spawn only fully set up rendering entities. First, we need to determine if the prefab is static. I used the static flag from the GameObject, but since this flag is unavailable in builds, it must be cached during baking.
EntityCommandBuffer
You might hear that EntityManager should be used whenever possible because it’s faster than EntityCommandBuffer. We tested this, but it was significantly slower in our case. At Start, we spawn thousands of entities for Renderers and hundreds for LODGroups, in chains of operations on these entities. Most operations occur after a scene loads, though a few take place during gameplay.
For LodGroup, the chain looks like:
- Create Entity
- Set LocalToWorld
- Set MeshLODGroupComponent
- Set LinkedTransformComponent
- Set IdComponent (shared component)
- [Optionally] Add LinkedEntitiesAccessRequest
For Renderer:
- Create Entity
- Set LocalToWorld
- Set WorldRenderBounds
- Set DrakeMeshMaterialComponent
- [Optionally] Set MeshLODComponent
- [Optionally] Set DrakeRendererVisibleRangeComponent
- [Optionally] Set LODRange
- [Optionally] Set LODWorldReferencePoint
- [Optionally] Set LinkedTransformComponent
- [Optionally] Set RenderBounds
- [Optionally] Set LinkedTransformLocalToWorldOffsetComponent
- Set RenderFilterSettings (shared component)
- Set IdComponent (shared component)
- [Optionally] Add LinkedEntitiesAccessRequest
Archetypes
Adding components is expensive, so we create Entities with the correct archetype using DrakeRendererArchetypeKey. It covers all configurations, and the manager generates archetypes for each.
The main logic looks like:
static DrakeRendererArchetypeKey[] CreateAllValues() {
    // _static * _isTransparent * _hasLod * _inMotionPass * _lightProbeUsage * _hasShadowsOverriden * _hasLocalToWorldOffset
    DrakeRendererArchetypeKey[] values = new DrakeRendererArchetypeKey[2*2*2*2*4*2*2];
    int index = 0;
    for (int i = 0; i < 2; i++) {
        var isStatic = i == 1;
        for (int j = 0; j < 2; j++) {
            var isTransparent = j == 1;
            for (int k = 0; k < 2; k++) {
                var hasLod = k == 1;
                for (int l = 0; l < 2; l++) {
                    var inMotion = l == 1;
                    for (int m = 0; m < 4; m++) {
                        // Off, then the flag values 1, 2 and 4
                        var lightProbeUsage = m == 0 ? LightProbeUsage.Off : (LightProbeUsage)(1 << (m-1));
                        for (int n = 0; n < 2; n++) {
                            var hasShadowsOverriden = n == 1;
                            for (int o = 0; o < 2; o++) {
                                var hasLocalToWorldOffset = o == 1;
                                values[index++] = new DrakeRendererArchetypeKey(isStatic, isTransparent, hasLod, inMotion, lightProbeUsage, hasShadowsOverriden, hasLocalToWorldOffset);
                            }
                        }
                    }
                }
            }
        }
    }
    return values;
}

var archetypeKeys = DrakeRendererArchetypeKey.All;
_entityArchetypes = new NativeHashMap<DrakeRendererArchetypeKey, EntityArchetype>(archetypeKeys.Length, Allocator.Domain);
foreach (var archetypeKey in archetypeKeys) {
    _entityArchetypes.Add(archetypeKey, CreateArchetype(archetypeKey, entityManager));
}
// From Unity.Rendering.RenderMeshUtility.EntitiesGraphicsComponentTypes
EntityArchetype CreateArchetype(DrakeRendererArchetypeKey archetypeKey, EntityManager entityManager)
{
    // Components shared by every configuration
    var components = new UnsafeList<ComponentType>(24, ARAlloc.Temp) {
        ComponentType.ReadWrite<WorldRenderBounds>(),
        ComponentType.ReadWrite<DrakeMeshMaterialComponent>(),
        ComponentType.ReadWrite<PerInstanceCullingTag>(),
        ComponentType.ReadWrite<WorldToLocal_Tag>(),
        ComponentType.ReadWrite<LocalToWorld>(),
        ComponentType.ReadWrite<MipmapsFactorComponent>(),
        ComponentType.ChunkComponent<ChunkWorldRenderBounds>(),
        ComponentType.ReadWrite<RenderFilterSettings>(),
        ComponentType.ReadWrite<SystemRelatedLifeTime<DrakeRendererManager>.IdComponent>(),
        ComponentType.ReadWrite<ShadowsProcessedTag>(),
    };
    if (archetypeKey.isStatic)
    {
        components.Add(ComponentType.ReadWrite<Static>());
    }
    else
    {
        components.Add(ComponentType.ReadWrite<LinkedTransformComponent>());
        components.Add(ComponentType.ReadWrite<RenderBounds>());
        if (archetypeKey.inMotionPass)
        {
            components.Add(ComponentType.ReadWrite<BuiltinMaterialPropertyUnity_MatrixPreviousM>());
        }
        if (archetypeKey.hasLocalToWorldOffset)
        {
            components.Add(ComponentType.ReadWrite<LinkedTransformLocalToWorldOffsetComponent>());
        }
    }
    if (archetypeKey.hasLodGroup)
    {
        components.Add(ComponentType.ReadWrite<DrakeRendererVisibleRangeComponent>());
        components.Add(ComponentType.ReadWrite<LODRange>());
        components.Add(ComponentType.ReadWrite<LODWorldReferencePoint>());
        if (!archetypeKey.isStatic)
        {
            components.Add(ComponentType.ReadWrite<MeshLODComponent>());
        }
    }
    else
    {
        components.Add(ComponentType.ReadWrite<DrakeRendererLoadRequestTag>());
    }
    if (archetypeKey.isTransparent)
    {
        components.Add(ComponentType.ReadWrite<DepthSorted_Tag>());
    }
    if (archetypeKey.lightProbeUsage == LightProbeUsage.BlendProbes)
    {
        components.Add(ComponentType.ReadWrite<BlendProbeTag>());
    }
    else if (archetypeKey.lightProbeUsage == LightProbeUsage.CustomProvided)
    {
        components.Add(ComponentType.ReadWrite<CustomProbeTag>());
    }
    if (archetypeKey.hasShadowsOverriden)
    {
        components.Add(ComponentType.ReadWrite<ShadowsChangedTag>());
    }
#if UNITY_EDITOR
#if DEBUG
    components.Add(ComponentType.ReadWrite<CullingDistancePreviewComponent>());
#endif
    if (!UnityEditor.EditorPrefs.GetBool("showEntities", false)) {
        components.Add(ComponentType.ReadWrite<EntityGuid>());
    }
    components.Add(ComponentType.ReadWrite<EditorRenderData>());
#endif
    var archetype = entityManager.CreateArchetype(components.AsNativeArray());
    components.Dispose();
    return archetype;
}
This reduces runtime operations significantly.
Scene parts hide/show
In the “Two-Way Link” section, I mentioned that entity access is managed per LOD hierarchy. That isn't always optimal. In the game, we handle several map region changes that involve hiding or showing many such hierarchies, which leads to numerous small operations for each action and requires many LinkedEntitiesAccess MonoBehaviours.
To address it, we introduced SharedLinkedEntitiesAccess: if available in a parent, we register entities to it instead of creating a new LinkedEntitiesAccess.
Merged drake
We have a lot of static Drakes, and each renderer and LOD group requires a short-lived MonoBehaviour. That means a lot of small allocations.
A lot of small allocations is the definition of bad memory management.
The solution is fairly simple:
At build:
- Collect all static Drakes from the scene
- Create a single MergedDrake GameObject with a GUID
- Collect all data required to spawn the collected Drakes
- Serialize the data as binary into a file in StreamingAssets; the file is addressed by the GUID
- Remove the processed Drakes
At runtime:
- Load the data into a Temp native allocation
- Register all meshes and materials
- Spawn Drakes via a bursted job
- Release the data
This requires only one MonoBehaviour with a single field (GUID) and, as a bonus, spawning is now bursted. Big win for simple batching.
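As a rough illustration of the build-time and runtime halves, here is a minimal round-trip sketch using BinaryWriter and BinaryReader. The record layout (a mesh key, a material key and a submesh index) is purely an assumption for the example, as are the type names. The post does not describe the actual file format, and the real data is loaded into a native allocation for a bursted job rather than into managed objects.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical record: the real data set is not described in detail in the post.
public struct MergedRecordSketch
{
    public string MeshKey;
    public string MaterialKey;
    public byte Submesh;
}

public static class MergedDrakeBlobSketch
{
    // Build-time half: pack all records into one binary blob.
    public static byte[] Write(IReadOnlyList<MergedRecordSketch> records)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(records.Count);
            foreach (var record in records)
            {
                writer.Write(record.MeshKey);
                writer.Write(record.MaterialKey);
                writer.Write(record.Submesh);
            }
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Runtime half: read the records back in the same order they were written.
    public static List<MergedRecordSketch> Read(byte[] blob)
    {
        using (var stream = new MemoryStream(blob))
        using (var reader = new BinaryReader(stream))
        {
            var count = reader.ReadInt32();
            var records = new List<MergedRecordSketch>(count);
            for (var i = 0; i < count; ++i)
            {
                var record = new MergedRecordSketch();
                record.MeshKey = reader.ReadString();
                record.MaterialKey = reader.ReadString();
                record.Submesh = reader.ReadByte();
                records.Add(record);
            }
            return records;
        }
    }
}
```

One blob per scene replaces many per-object MonoBehaviours, which is where the allocation savings come from.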
Closing
As you can see, even a 'simple' "runtime entity renderer registration system" can be complex.
At the same time, you may notice that even a straightforward system, when scaled, reveals new optimization opportunities.
To see it in action, try playing Tainted Grail: The Fall of Avalon and count how much rendering is going on (be aware there is no occlusion culling) :)