Barbara Hinderer | March 16, 2021

Blog post by Andreas Erben, CEO, CTO, Microsoft MVP & Regional Director, Scrum Trainer, entrepreneur, advisor, public speaker

The currently prevalent paradigm for designing and developing computer applications was mostly defined in the 1960s, over 50 years ago. The key innovations were the mouse and the touchscreen. The concept of the terminal with a graphical user interface using a mouse dates to the late 1960s.

For most of us, this describes how we spend most of our time interacting with information technology, between phones and computers. As a result, our thinking is culturally anchored in that paradigm, and whenever conversations about designing applications occur, it is natural to fall back on it, including when talking about Mixed Reality applications.

Without guidance, many customers who hear about Augmented Reality, Virtual Reality, Mixed Reality, Extended Reality, and related keywords mostly think of placing two-dimensional information, resembling a computer display, somewhere in a three-dimensional context. In some cases this can indeed be a completely valid approach, but typically it represents a lost opportunity to think differently about applications or, to be precise, about the problems to solve, the opportunities that present themselves, and the solutions to provide.

I was fortunate to have been exposed to Virtual Reality technology and solutions as early as the mid-1990s, mostly in an academic context. It was by no means commercially viable for broad adoption at that time, but interesting concepts were already emerging. To me, the core difference from those days, after starting to work with Microsoft HoloLens, was the new ability to interact with our real environment in a different way, and the realization that an IT system could become a participant in a more natural interaction with our environment. The core capability behind this is sometimes called “Perception”: it enables the computing device, through its sensors and algorithms, to gain some understanding of what is happening in the real world.

One of the key opportunities is that we no longer have to learn how to operate a keyboard or a mouse to get the result we want. Ideally, the IT system naturally understands our intent as we interact with our natural environment, and presents information where it matters. We do not click a button if we want to take a bite of an apple; we just grab the apple and take that bite. Our natural interactions are driven by what our biology enables us to do. Of course, philosophically one can argue that this also anchors us and limits our choices in designing systems, since we could instead aim at enhancing our old ways of grasping reality with new concepts. Yet I believe that at this point in time it is important to emphasize working with our basic natural paradigms, as this is most inclusive of the experiences that all humans share and can lead to applications that share the most common ground for humanity.

This is where Artificial Intelligence comes into play: it is the core enabler for gaining that understanding of the environment and of user interactions. Microsoft HoloLens, for example, provides mechanisms to establish a coordinate space for our world and to locate itself in that world, as well as access to sensors, including camera and microphone. These allow it first to better understand what is going on, and then to derive from sensory input and context what should happen next and what the intent of the person using the device could be.

Developing ideas to then tackle in a project with a customer, a process often called “envisioning”, requires balance. It takes both an experienced technology partner with the right mixture of bold vision and solid understanding of technology, and a customer who is willing and able to commit to letting those ideas flow and grow. If the customer is not committed to that, frustration can often occur on both sides. A very typical scenario is that the customer keeps going back to wanting to see specific two-dimensional screens in a three-dimensional world, possibly adding a 3D model of a product they are using, building, or selling. At the same time, they typically expect it all to look great, based on big-budget VR content they may have seen somewhere, while costing as much as a one-screen mobile form-input app.

My guidance is to always ask where the delivered value lies at each stage of the potential project, to avoid diverging expectations. It is also immensely useful to introduce customers to some concepts in Mixed Reality applications through demos. Those could be third-party applications, or even games, as long as they tell the story of the key point to pay attention to, of “what matters”. An early example of an application for Microsoft HoloLens was the shooting game “RoboRaid”. The game is often suitable for a quick demonstration because it does not take much time to set up and to let someone experience it. While it is easy for a customer to get immersed in and excited about the game, the storytelling that needs to accompany it plays an even bigger role. The key is the interaction with the environment. RoboRaid “scans” the environment for suitable spaces on the walls where enemy robots will appear and crawl. This unlocks thinking about Spatial Applications.

In one of our own applications, we developed an interactive animated avatar that is able to follow the user around and, when commanded to do so, walks towards points of interest in space to guide the user's attention. That application made intelligent use of space by analyzing the 3D environment and placing objects in that environment. The conversation with a customer can then be about all the scenarios in which a user of the application would be guided to points of interest in a real environment, or to virtual objects placed in a real environment. From there, the context can be expanded to mechanisms for interacting with three-dimensional space.

When it comes to understanding the environment, combining capabilities of the platform used, such as Microsoft HoloLens' world coordinate system, with cloud-based Microsoft Azure Spatial Anchors can unlock further value, in this case the ability to integrate with third-party AR/VR platforms and to persist locations easily. Since most AR platforms allow access to a regular (RGB) camera, applications can integrate general-purpose computer vision capabilities while remaining aware of the spatial context. Put simply, this means knowing precisely what we were looking at, from what angle, and at what point in time. Using other services that add insight into digital images, such as Microsoft Azure Cognitive Services Computer Vision, then makes it possible to annotate three-dimensional space and to analyze changes in observations over time. One example could be to compare the approximate locations of common physical objects, such as furniture in a building, over time and to automatically detect relevant changes in the positions of those objects.

This represents a relatively straightforward approach to build spatially aware applications that is accessible to development teams with average skillsets utilizing off-the-shelf technologies, specifically democratized Computer Vision and AI in the cloud.
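To make the furniture example a little more concrete, here is a minimal sketch in Python. It assumes that each observation has already been reduced to an object label (for example, from an image analysis service) and a position in metres in a shared, anchor-relative coordinate space. The names `Observation` and `detect_moves`, the 0.5 m threshold, and the simplification that each label identifies exactly one object are all hypothetical choices for illustration, not part of any Microsoft API.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, Iterable, Tuple

Position = Tuple[float, float, float]  # metres, in an assumed shared coordinate space


@dataclass(frozen=True)
class Observation:
    """One labelled object seen at a position (hypothetical structure)."""
    label: str
    position: Position


def detect_moves(before: Iterable[Observation],
                 after: Iterable[Observation],
                 threshold_m: float = 0.5) -> Dict[str, float]:
    """Return {label: distance moved} for objects that moved more than threshold_m.

    Assumes each label identifies exactly one object; objects that appear in
    only one of the two snapshots are ignored in this sketch.
    """
    earlier = {obs.label: obs.position for obs in before}
    moved = {}
    for obs in after:
        if obs.label in earlier:
            distance = dist(earlier[obs.label], obs.position)
            if distance > threshold_m:
                moved[obs.label] = distance
    return moved


# Two snapshots of the same room, taken at different points in time.
monday = [Observation("chair", (0.0, 0.0, 0.0)), Observation("desk", (2.0, 0.0, 1.0))]
friday = [Observation("chair", (1.5, 0.0, 0.0)), Observation("desk", (2.0, 0.0, 1.1))]
print(detect_moves(monday, friday))
```

In this sketch only the chair is reported, because the desk moved less than the threshold. A real application would also have to cope with several objects sharing a label and with measurement noise in the estimated positions.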

Another aspect of building applications for Mixed Reality concerns those development teams. A Mixed Reality application often requires skills distributed over team members with more diverse professional backgrounds than a typical enterprise application. The development process is often more similar to producing a computer game or an interactive media installation, and in many regards has much in common with what creative agencies do. We may find specialists for interactive design, for 3D interactions, for 3D modelling and optimization, creative directors, sound designers, and niche technical skills such as optimizing GPU shaders. For those teams to succeed, team culture and organizational culture play a significant role. To point out the extremes: in a traditional engineering-focused structure, creative talent may feel restricted and creativity choked, while a culture that is too similar to that of some creative agencies may not emphasize software engineering with long-term maintainability. Some creative agencies, for example, excel at delivering a very appealing end result, but their solution is often not maintainable over years of enterprise use. At the other extreme, the leadership style and communications in many software development companies with a great track record of engineering excellence may limit the creativity of their overall solution design and art.

We had to go through our own learning curve on how to handle those interactions. One example is the design process for the avatar mentioned above. When we received the first iteration of the 3D model of our avatar, we had to ask our character designer and 3D artist to make some changes. It was evident that this was almost hurtful for the artist, which at first surprised us. We understood better once we looked at the process. The artist cared painstakingly for every detail of his artwork and was hand-drawing the fur texture, the strands of hair, that would then be used on the avatar. It was his virtual “baby”, his creation. For many artists, the sense of creative ownership of and self-identification with their artwork is a lot stronger than the way a software engineer views their output. A software engineer may easily shrug off a minor change request to a detail of their application, or someone coming up with a more efficient solution to a problem already solved in the implementation. Our learning included improving our feedback process: while ensuring that communication between team members flows, we adjust how we communicate necessary changes so that it suits each team member's background and needs. For example, we typically seek “buy-in” for the changes from the creative artist in a more holistic way, rather than just requesting a change to one aspect of a design, so that everybody is on the same page and pulling in the same direction.

Finally, we noticed that our teams have become more diverse and that our interactions typically cross geographic and cultural borders. It is not uncommon to have a team distributed over multiple time zones, countries, or continents. This sometimes presents its own challenges but mostly results in opportunities. We believe it is a rewarding experience that has made us stronger and better, but also more human.

Overall we believe we succeeded in developing a mindset and approach to make both our customers and ourselves more successful when tackling Spatial Computing Applications.

#mixedreality #mr #spatialcomputing #vr #ar #virtualreality #augmentedreality