Mixed Reality
Blog post by Andreas Erben, CEO, CTO, Microsoft MVP & Regional Director, Scrum Trainer, Entrepreneur, Consultant, Speaker
The currently prevailing paradigm for designing and developing computer applications was primarily defined in the 1960s, over 50 years ago. The key innovations at that time were the mouse and the touchscreen. The concept of the terminal with a graphical user interface using a mouse dates back to the late 1960s.
For most of us, this describes how we spend most of our time with information technology on smartphones and computers. As a result, our thinking is culturally anchored in this paradigm. When discussions about designing applications take place, it is natural to revert to this paradigm, even when talking about mixed reality applications.
When customers hear about augmented reality, virtual reality, mixed reality, extended reality, and other related keywords, most of them tend to think, without further explanation, of two-dimensional information resembling a computer display placed somewhere in a three-dimensional context. While this may be a reasonable approach in some cases, it generally represents a missed opportunity to rethink applications, or more precisely, the problems to be solved and the possibilities and new solutions that result.
I was fortunate to have been involved with virtual reality technologies and solutions in the mid-1990s, mainly in an academic context. At that time, the technology was far from commercially viable for widespread application, but interesting concepts were already emerging. For me, the main difference when I started working with Microsoft HoloLens was the new ability to interact with and perceive our real environment in a different way, realizing that an IT system can participate in a more natural interaction with our environment. This core function is sometimes referred to as “perception,” allowing the device to gain a certain understanding of what is happening in the real world through its sensors and algorithms.
One of the most significant opportunities is not having to learn how to interact with a keyboard or mouse to achieve the desired outcome, but rather having the IT system ideally understand our intention when we interact with our natural environment and present information where it matters. We don’t click a button when we want to bite into an apple; we simply reach for the apple and take a bite. Our natural interactions are determined by what our biology allows us to do. Of course, one can argue philosophically that this also anchors us and limits our choices for designing systems, as we could aim to improve our own conventional approach to capture reality with new concepts. However, I believe that currently, it is an important approach to focus on working with our fundamental natural paradigms, as this also encompasses the most comprehensive experiences shared by all humans. And that can lead to applications that benefit the vast majority of people.
Artificial Intelligence
In this context, artificial intelligence plays a crucial role in providing the foundation for understanding the environment and user interactions. For instance, Microsoft HoloLens offers mechanisms to provide a coordinate space for our world, locate ourselves within this world, and access sensors like cameras and microphones. These capabilities allow for a better understanding of the context and user intentions based on sensory inputs.
The process of developing ideas and then embarking on a project with a customer, often referred to as “envisioning,” requires an experienced technology partner with the right blend of bold visions and solid technical expertise. It also requires a customer who is willing and able to foster these ideas and expand upon them. If the customer does not fully embrace the possibilities, frustrations can arise on both sides. A common scenario involves the customer repeatedly expecting two-dimensional screens in a three-dimensional world or demanding a 3D model of a product they use, build, or sell. At the same time, there is often an expectation for everything to look amazing based on high-budget VR content seen elsewhere, while still expecting the project to cost no more than a typical form input app for smartphones.
My advice is to always question where the value lies in each phase of a potential project to avoid diverging expectations. Additionally, it is highly beneficial to introduce customers to some concepts in mixed reality applications through demos. These can be third-party applications or even games, as long as they convey the essence of what matters. An early example of an application for Microsoft HoloLens was the shooter game “RoboRaid”. The game is often suitable for a quick demonstration as it doesn’t take much time to set up and let someone experience it. While it’s easy for a customer to immerse themselves in the game and get excited about playing it, the accompanying narrative plays an even bigger role. The key is the interaction with the environment. RoboRaid “scans” the surroundings for suitable spots on the wall where enemy robots appear and crawl along. This provides a starting point for thinking about spatial applications.
In one of our own applications, we developed an interactive animated avatar that can follow the user and, upon command, move towards points in space to attract the user’s attention. This application intelligently utilized space by analyzing the 3D environment and placing objects within that environment. The conversation with a customer can then encompass all scenarios where a user of the application is guided to specific targets in a real environment or virtual objects in a real environment. Then, the context can be expanded to include mechanisms for interacting with the three-dimensional space.
To understand the environment, the functions of the platform being used, such as the world coordinate system of Microsoft HoloLens, can be combined with cloud-based capabilities such as Microsoft Azure Spatial Anchors to achieve further benefits. In this case, third-party AR/VR platforms can be integrated, and the spatial location can be easily maintained. Since most AR platforms provide access to a regular camera (RGB), general computer vision capabilities can be integrated into applications while considering the spatial context. Simply put, it means knowing precisely what we have observed from which perspective at any given time. By leveraging other services that offer insights into digital images, such as Computer Vision from Microsoft Azure Cognitive Services, three-dimensional spaces can be annotated, and changes in observations over time can be analyzed. For example, one could compare the approximate position of common physical objects like furniture in a building over time and automatically detect relevant changes in object positions.
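To make the furniture example a little more concrete, here is a minimal sketch in Python of how such a comparison over time might look. It is not taken from any of our applications and does not call any real service: the names Observation and compare_snapshots, the threshold of 0.5 metres, and the sample data are all hypothetical. It also assumes that each recognized object already has a unique label and a position in a shared world coordinate system, which in practice is the hard part.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Observation:
    """One recognized object, placed in a shared world coordinate system."""
    label: str                             # hypothetical tag, e.g. from an image analysis service
    position: tuple[float, float, float]   # metres, relative to a common anchor (assumption)


def compare_snapshots(before, after, threshold_m=0.5):
    """Compare two scans of the same space.

    Returns (moved, appeared, disappeared), where moved maps a label to the
    distance in metres it travelled, if that distance exceeds threshold_m.
    """
    earlier = {obs.label: obs.position for obs in before}
    later = {obs.label: obs.position for obs in after}

    moved = {}
    for label in sorted(earlier.keys() & later.keys()):
        distance = dist(earlier[label], later[label])
        if distance > threshold_m:
            moved[label] = distance

    appeared = sorted(later.keys() - earlier.keys())
    disappeared = sorted(earlier.keys() - later.keys())
    return moved, appeared, disappeared


# Made-up example data: two scans of the same office.
monday = [
    Observation("desk", (0.0, 0.0, 0.0)),
    Observation("chair", (1.0, 0.0, 0.0)),
    Observation("lamp", (2.0, 0.0, 2.0)),
]
friday = [
    Observation("desk", (0.1, 0.0, 0.0)),   # small jitter, below the threshold
    Observation("chair", (1.0, 0.0, 2.0)),  # moved two metres
    Observation("plant", (3.0, 0.0, 3.0)),  # new object
]

moved, appeared, disappeared = compare_snapshots(monday, friday)
print(moved, appeared, disappeared)
```

The threshold is there because positions derived from camera images are only approximate, so small differences between two scans should not be reported as changes. A real application would also need a way to tell apart several objects with the same label, which this sketch deliberately ignores.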
This approach represents a relatively straightforward way to create applications with spatial awareness that development teams with average skills can access using standard technologies, particularly democratized computer vision and cloud-based AI.
Another aspect of creating applications for mixed reality involves the development teams themselves. Compared to typical enterprise applications, a mixed reality application often requires skills distributed among team members with different professional backgrounds. The development process often resembles the creation of a computer game or an interactive media installation and shares many similarities with what creative agencies do. Such teams include specialists in interactive design, 3D interactions, and 3D modeling and optimization, as well as creative directors, sound designers, and people with niche technical skills like GPU shader optimization. The team culture and organizational culture play an important role in the success of these teams. To illustrate extremes, in a traditionally engineering-focused structure, creative talents may feel constrained and creativity may be stifled, while a culture too similar to some creative agencies may not lead to software engineering that is sustainable in the long run. For example, some creative agencies may deliver a visually appealing end product, but the solution may not be viable for long-term use by the company. On the other hand, the leadership style and communication in many software development companies with a track record of technical excellence may restrict creativity in terms of overall solution design and aesthetics.
We have had to go through our own learning curve in dealing with these interactions. For example, in the design process for the avatar mentioned above, when we received our first iteration of the 3D model of the avatar, we had to ask our character designer and 3D artist to make some changes. It was obvious that this was almost hurtful for the artist, which initially surprised us. We understood this better when we looked at the process. The artist had carefully attended to every detail of their artwork, meticulously drawing the fur texture and individual strands of hair that were then used for the avatar. It was their virtual “baby,” their creation. The sense of creative ownership and self-identification with their artworks is much stronger for many artists compared to how a software developer may view their results. A software developer can more easily handle a request for a minor change in the details of their application, or someone finding a more efficient solution to a problem solved in the software implementation. This insight led us to improve our feedback process, to ensure successful communication among team members, and to adapt the way we convey necessary changes so that it better aligns with each team member’s background and needs. For example, we usually try to “embrace” the changes from the creative artist in a more holistic way, rather than simply demanding that a specific aspect of a design be altered, so that everyone can agree and move forward together.
Lastly, we have found that our teams have become more diverse, and our interactions generally transcend geographical and cultural boundaries. It is not uncommon for a team to be distributed across multiple time zones, countries, or continents. This in itself sometimes presents its own challenges but usually
#mixedreality #mr #spatialcomputing #vr #ar #virtualreality #augmentedreality