Category Archives: Knowledge Base

Privacy in the Metaverse

Or: How to install any app on the Quest 3 without giving Meta your phone number.

With the long-anticipated Apple Vision Pro becoming available on February 2nd, 2024 (unfortunately only in the US), we’ll finally see Apple’s take on a consumer-ready headset for mixed reality – er … I meant to say spatial computing. Seamless video see-through and hand tracking – what a technological marvel.

As of now, the closest alternative to the Vision Pro – for those unwilling to spend $3500 or located outside the US – seems to be the Meta Quest 3, and at a fraction of the price: $500. But unlike Apple, Meta is less known for privacy-aware products. After all, it is their core business model not to be.

This post explains how to increase your privacy on the Quest 3, in four easy steps.

Continue reading Privacy in the Metaverse

C++ Weihnachtsspecial

English summary: This is a holiday special for my German C++ beginners book.

This blog post extends your knowledge from the book C++ Schnelleinstieg: Programmieren lernen in 14 Tagen. After chapter 9 (advanced concepts) you have all the fundamentals required as prerequisites for this special:

  • Basics of object orientation
  • Including libraries via vcpkg
  • GUI programming with Nana
  • Callbacks and lambdas
  • Throwing exceptions

In this project we will program a Christmas tree on which you can light the candles with a mouse click. This is what the whole thing looks like when it is finished:

The code example starts in the program’s main function and gradually inserts additional components above it – the code is therefore presented in the order it will have in the finished file (download link at the end of the post), interrupted by explanations along the way.

int main()
{
  nana::paint::image tree("tree.png");
  if (tree.empty())
  {
    /* You have to copy the three images (tree.png, candleOn.png, candleOff.png)
       into the build folder. You can find it by right-clicking the project in
       Visual Studio's Solution Explorer and selecting
       "Open Folder in File Explorer".
    */
    throwException("Could not load the tree image!");
  }

The function throwException will be implemented later on. The three images required by the program – the tree, the extinguished candle and the lit candle – can be found together with the finished code file at the end of this post. Now for the basic structure: the window and the candles.

  // Create a Nana form as the basis of the window
  nana::form window(nana::API::make_center(tree.size().width, tree.size().height));
  window.caption("C++ Weihnachtsspecial 2021");

  // Each code line when creating the list corresponds to one level of the tree
  std::vector<Candle> candles = { Candle(290, 200),
    Candle(200, 300), Candle(300, 330), Candle(400, 315),
    Candle(120, 430), Candle(250, 450), Candle(350, 450), Candle(480, 450),
    Candle(70, 580), Candle(170, 620), Candle(290, 630), Candle(420, 625), Candle(540, 600) };

The candles are modeled as objects and created with x and y coordinates. The class definition follows shortly, after the rest of the main function has been discussed. Until then, follow the principle of “wishful programming” explained in section 4.3 of the book and simply assume that you already had such a class fulfilling all your wishes. Next, all the graphics shall be drawn:

  // Draw the tree and the candles
  nana::drawing drawing(window);
  drawing.draw([&](nana::paint::graphics& graphics)
  {
    tree.paste(graphics, nana::point(0, 0));
    for (Candle& candle : candles)
    {
      candle.draw(graphics);
    }
    graphics.string(nana::point(10, static_cast<int>(window.size().height) - 20),
      "C++ Schnelleinstieg: Programmieren lernen in 14 Tagen. Philipp Hasper, 2021");
  });

The drawing of the Christmas scenery is handled by the Nana class nana::drawing inside a lambda function. You can see two handy methods in action here:

  • tree.paste() copies the image of the tree to a given position – here (0,0), which corresponds to the upper left corner. The graphics object, which comes from the lambda function’s parameter, has to be passed along as well.
  • graphics.string() writes a text at a given position. Here it should go into the lower left corner, so enough space has to be subtracted from the window height for the y-coordinate that the text still fits.

Finally, the click events are still missing. Since the candles are drawn as plain graphics, they do not come with an automatic click event like, say, a button. Therefore, every click into the window has to be intercepted and checked whether its coordinates lie within a candle:

  // Check for every click whether it hits a candle
  window.events().click([&](const nana::arg_click& event) {
    if (event.mouse_args == nullptr)
    {
      // Event cannot be processed
      return;
    }
    for (Candle& candle : candles)
    {
      if (candle.isClicked(event.mouse_args->pos))
      {
        candle.toggle();
        /* One could leave the loop here with a return. But since
           badly placed candles might overlap, the loop is run
           to completion.
        */
      }
    }
    // Redraw so that the image gets updated.
    drawing.update();
  });

  // Show the window and start Nana
  window.show();
  nana::exec();
  return 0;
}

Whether a candle was clicked is checked inside the Candle class. The reaction to a click is implemented in this class as well. Therefore, insert this class above the main function:

// Class for drawing a candle in two states (lit, blown out)
class Candle {
private:
  nana::paint::image candleOn;
  nana::paint::image candleOff;
  bool isOn = false;
  nana::rectangle rectangle;
public:
  Candle(int centerX, int centerY)
    : candleOn("candleOn.png"),
      candleOff("candleOff.png")
  {
    if (candleOn.empty() || candleOff.empty())
    {
      throwException("Konnte die Kerzenbilder nicht laden!");
    }
    if (candleOn.size() != candleOff.size())
    {
      throwException("Kerzenbilder haben nicht die gleiche Größe!");
    }
    /* The x and y coordinates passed to the constructor specify the center
       of the candle, so they have to be converted for the target rectangle.
     */
    nana::size size = candleOn.size();
    rectangle = nana::rectangle(centerX - size.width/2,
                                centerY - size.height/2,
                                size.width,
                                size.height);
  }

  void draw(nana::paint::graphics& graphics)
  {
    if (isOn)
    {
      candleOn.paste(graphics, rectangle.position());
    }
    else
    {
      candleOff.paste(graphics, rectangle.position());
    }
  }

  bool isClicked(const nana::point& point)
  {
    // This method tests whether the point lies inside the rectangle
    return rectangle.is_hit(point);
  }

  void toggle()
  {
    isOn = !isOn;
  }
};

This class handles two different images which are displayed centered on a point. Depending on the state (isOn), one or the other image is drawn. For easier placement, the desired center of the candle is specified when the object is created, which the constructor then converts into a corresponding rectangle.

One helper function is still missing: it prints an error message to the console and then throws an exception. Insert it at the beginning of the file, after the include directives:

// Function that prints an error message to the console and then throws an exception
void throwException(const std::string& msg)
{
  std::cout << msg << std::endl;
  throw std::runtime_error(msg);
}

And that was the whole program already. Happy holidays and a happy New Year! You can download the complete code and the images here:

Jekyll plugin to bundle zip archives

The website for my recently published C++ book, cpp.hasper.info (German), was made with the static site generator Jekyll. It contains additional information, such as the code of all projects inside the book, as well as the sample solutions for the exercises at the end of each chapter.

The reason I used a static site generator in the first place was that I had all the code files organized in a folder, equipped with a CMake file which made sure the projects compile and are statically analyzed (I used cppcheck and cpplint, albeit with a very reduced set of checks due to the nature of the code examples). In order not to destroy this automation by copy-pasting code into a CMS, the site had to be generated around the code files.
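
For illustration, such a setup could look roughly like this – a minimal sketch using CMake’s built-in cppcheck/cpplint integration, with hypothetical target and folder names, not the book’s actual configuration:

cmake_minimum_required(VERSION 3.10)
project(book_code CXX)

# Run the static analyzers alongside compilation (with a reduced rule set)
set(CMAKE_CXX_CPPCHECK "cppcheck;--enable=warning")
set(CMAKE_CXX_CPPLINT "cpplint;--filter=-legal/copyright,-whitespace")

# One target per book project, e.g.:
add_executable(chapter09 chapter09/main.cpp)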

I also wanted to enable the download of all C++ files as a zip archive on a per-chapter basis. Again, I did not want to create this archive manually, in case I had to change some code in the future. So I wrote a Jekyll plugin which bundles the given files into a zip archive that can then be placed behind a download link.

How it is used:

Filenames as multiple parameters:

{% zip archiveToCreate.zip file1.txt file2.txt %}

Spaces in filenames:

{% zip archiveToCreate.zip file1.txt folder/file2.txt 'file with spaces.txt' %}

A variable to contain a list of files is also possible:

{% zip archiveToCreate.zip {{ chapter_code_files }} %}
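
Since the plugin registers the created archive as a static file under the target path, a plain Markdown link next to the tag should be enough to offer the download – a sketch with an example path:

{% zip code/chapter01.zip {{ chapter_code_files }} %}
[Download the chapter’s code](/code/chapter01.zip)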

The plugin code:

The plugin can be found here: https://github.com/PhilLab/jekyll-zip-bundler

# frozen_string_literal: true

# Copyright 2021 by Philipp Hasper
# MIT License
# https://github.com/PhilLab/jekyll-zip-bundler

require 'jekyll'
require 'zip'
# ~ gem 'rubyzip', '~>2.3.0'

module Jekyll
  # Valid syntax:
  # {% zip archiveToCreate.zip file1.txt file2.txt %}
  # {% zip archiveToCreate.zip file1.txt folder/file2.txt 'file with spaces.txt' %}
  # {% zip {{ variableName }} file1.txt 'folder/file with spaces.txt' {{ otherVariableName }} %}
  # {% zip {{ variableName }} {{ VariableContainingAList }} %}
  class ZipBundlerTag < Liquid::Tag
    VARIABLE_SYNTAX = /[^{]*(\{\{\s*[\w\-.]+\s*(\|.*)?\}\}[^\s{}]*)/mx.freeze
    CACHE_FOLDER = '.jekyll-cache/zip_bundler/'

    def initialize(tag_name, markup, tokens)
      super
      # Split by spaces but only if the text following contains an even number of '
      # Based on https://stackoverflow.com/a/11566264
      # Extended to also not split between the curly brackets of Liquid
      # In addition, make sure the strings are stripped and not empty
      @files = markup.strip.split(/\s(?=(?:[^'}]|'[^']*'|{{[^}]*}})*$)/)
                     .map(&:strip)
                     .reject(&:empty?)
    end

    def render(context)
      # First file is the target zip archive path
      target, files = resolve_parameters(context)
      abort 'zip tag must be called with at least two files' if files.empty?

      zipfile_path = CACHE_FOLDER + target
      FileUtils.makedirs(File.dirname(zipfile_path))

      # Create the archive. Delete file, if it already exists
      File.delete(zipfile_path) if File.exist?(zipfile_path)
      Zip::File.open(zipfile_path, Zip::File::CREATE) do |zipfile|
        files.each do |file|
          # Two arguments:
          # - The name of the file as it will appear in the archive
          # - The original file, including the path to find it
          zipfile.add(File.basename(file), file)
        end
      end
      puts "Created archive #{zipfile_path}"

      # Add the archive to the site's static files
      site = context.registers[:site]
      site.static_files << Jekyll::StaticFile.new(site, "#{site.source}/#{CACHE_FOLDER}",
                                                  File.dirname(target),
                                                  File.basename(zipfile_path))
      # No rendered output
      ''
    end

    def resolve_parameters(context)
      # Resolve the given parameters to a file list
      target, *files = @files.map do |file|
        next file unless file.match(VARIABLE_SYNTAX)

        # This is a variable. Look it up.
        context[file]
      end

      [target, files]
    end
  end
end

Liquid::Template.register_tag('zip', Jekyll::ZipBundlerTag)

How to remove Unity Services from your project

I recently found myself in the situation that a Unity project which had changed ownership still linked to now non-existent Unity Services. It constantly gave me warning messages like:

Unable to access Unity services. Please log in or request membership to this project to use these services

These were only warnings, so nothing particularly dangerous, but I still wanted to get rid of them. As it took me multiple iterations to solve and there are several unsolved or incomplete posts about it, I want to share how I finally managed to solve this problem:

  1. Navigate to the file <Your project folder>/ProjectSettings/ProjectSettings.asset and open it with a text editor
  2. In the section cloudServicesEnabled, set all entries to 0
  3. Set cloudProjectId to empty (i.e. nothing but a space after the colon)

Especially the space part took me a while to figure out. The respective sections of the Unity project’s settings file now look similar to this:

...  
  vrEditorSettings:
    daydream:
      daydreamIconForeground: {fileID: 0}
      daydreamIconBackground: {fileID: 0}
  cloudServicesEnabled:
    UNet: 0
  luminIcon:
    m_Name: 
    m_ModelFolderPath: 
    m_PortalFolderPath: 
  luminCert:
    m_CertPath: 
    m_SignPackage: 1
  luminIsChannelApp: 0
  luminVersion:
    m_VersionCode: 1
    m_VersionName: 
  apiCompatibilityLevel: 6
  cloudProjectId: 
  framebufferDepthMemorylessMode: 0
...

Having these modifications in the project settings made the warning disappear, as the unused cloud services are now properly removed from the project. However, given that Unity sometimes changes its framework and configurations, this might become outdated some day. If it no longer works when you try it, please drop me a message so I can retry and update the article. So far, the warning has not popped up again in any of my Unity projects.

Custom backgrounds in Microsoft Teams video calls

Update: In newer versions of Teams, a button has been added to add custom images: Just go to the Background settings and click “+ Add New”.

Microsoft Teams allows you to blur your background or replace it with pre-defined images. However, there is no user interface for uploading custom images.

It turns out that only the upload button is missing – if you know the folder from which Teams fetches the backgrounds, you can easily place your own content there.

  • On Windows, this is: %AppData%\Microsoft\Teams\Backgrounds\Uploads, i.e. C:\Users\<your user name>\AppData\Roaming\Microsoft\Teams\Backgrounds\Uploads
  • On Mac, this is: /users/<your user name>/Library/Application Support/Microsoft/Teams/Backgrounds/Uploads
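
For example, on Windows you can drop an image in from a command prompt (the file name is a placeholder):

copy my-background.jpg "%AppData%\Microsoft\Teams\Backgrounds\Uploads"

Teams then offers the image next to the pre-defined backgrounds.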

This allows for some creative effects:

Too subtle? What about this:

P.S.: One nifty trick to bewilder your colleagues: take a screenshot during the next video call, use Gimp to retouch the person out of the image and use their room as your background.

(extra blur added for this screenshot)

How we organize a small development team with minimal overhead in Gitlab

At ioxp we develop a novel way of sharing knowledge using Augmented Reality, AI and Computer Vision to change the way industries operate. Our ecosystem consists of many different tools and technologies – the AI backend, HoloLens apps, Android apps, a web-based editor and cloud services – so we depend on an efficient way to organize our developers and their different, often overlapping sub-teams.

And after all, we are a startup and engineers at heart – we have to keep the balance between a thoroughly designed process and the creative freedom needed to explore and create new things.

This post concentrates on how we facilitate developer communication and teamwork using features of the DevOps tool Gitlab, factoring out all other important stages like product management or sprint planning. It is an excerpt from our internal handbook (naturally hosted as a wiki in Gitlab) and describes the process I designed and implemented together with my team.

The development process is primarily guided by issues and their three most important attributes: labels, assignees and related merge requests. It uses as few as 10 different issue labels and is divided into six stages:

Continue reading How we organize a small development team with minimal overhead in Gitlab

How we keep an eye on our internal infrastructure with Gitlab – automated!

At ioxp, Gitlab is our tool of choice for organizing our development. We also use its wiki tool for our employee handbook, which is where we keep a list of all our internal services to maintain an overview of what is deployed, why and where. It looks like this:

To a) check that every service listed there is reachable and thus b) verify that the list is up-to-date (at least in one direction), we added a special Gitlab CI job which curls them all. This is its definition inside our main project’s .gitlab-ci.yml:

The script

Internal IT Check:
  only:
    variables:
      - $CI_CUSTOM_ITCHECK
  needs: []
  variables:
    GIT_STRATEGY: none
    WIKI_PAGE: IT-Address-List.md
  before_script:
    - apk add --update --no-cache git curl
    - git --version
    - git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@${CI_SERVER_HOST}/wiki/wiki.wiki.git
  script:
    # Scrape the URLs from the IT wiki page
    - cd wiki.wiki
    - |
      urls="$(cat $WIKI_PAGE | grep -Ev '~~.+~~' | grep -Eo '(http|https)://([-a-zA-Z0-9@:%_+.~#?&amp;/=]+)' | sort -u)"
      if [ -z "$urls" ]; then echo "No URLs found in Wiki"; exit 1; fi
    - echo "Testing URLs from the Wiki:" &amp;amp;&amp;amp; echo "$urls"
    # Call each URL and see if it is available. set +e is needed to not exit on subshell errors. set -e is to revert to old behavior.
    # Special case: As one of our services does not like curls, we have to set a user agent
    - set +e
    - failures=""
    - for i in $urls ; do result="$(curl -k -sSf -H 'User-Agent:Mozilla' $i 2>&1 | cut -d$'\n' -f 1)"; if [ $? -ne 0 ]; then failures="$failures\n$i - $result"; fi; done
    - set -e
    - if [ -z "$failures" ] ; then exit 0; fi
    # Errors happened
    - |
      errorText="ERROR: The following services listed in the Wiki are unreachable. Either they are down or the wiki page $WIKI_PAGE is out-dated:"
      echo -e "$errorText\033[0;31m$failures\033[0m"
      curl -X POST -H "Content-Type: application/json" --data "{ 'pretext': '${CI_JOB_NAME} failed: ${CI_JOB_URL}', 'text': '$errorText$failures', 'color': 'danger' }" ${CI_CUSTOM_SLACK_WEBHOOK_IT} 2>&1
      exit 1

Script Breakdown

  1. Gitlab wikis are a git repository under the hood. You can clone them via <projectUrl>.wiki.git (the URL can also be retrieved by clicking on “Clone repository” at the upper right of the wiki main page).
  2. The URLs are extracted by grepping the respective markdown page for http/https links: grep -Eo '(http|https)://([-a-zA-Z0-9@:%_+.~#?&/=]+)'
  3. To ignore deprecated links (strikethrough in the wiki), we exclude them before with grep -Ev '~~.+~~' and to filter out duplicates we use sort -u afterwards
  4. Each found URL is then called via curl:
    • -k disables the TLS certificate verification if you use self-signed certificates and haven’t installed your own root CA.
    • -H 'User-Agent:Mozilla' was needed because one of our services didn’t like to be curl’ed by a non-browser so we dress up as one.
  5. With set +e we prevent a subshell error (i.e. the curl) from making the Gitlab job exit with a failure prematurely, as we want a list of all errors, not just the first one.
  6. If any error is found, a post containing the list of unavailable services is sent to our IT Slack channel using their webhooks.

Gitlab Preconditions

To run this job at a fixed interval, we set up a scheduled pipeline, running every day at two in the afternoon. The schedule has the variable CI_CUSTOM_ITCHECK defined so the job’s only: selector fires (see the script above). All other jobs of our pipeline are defined to not run on schedules, or not when the variable is defined (using the except: selector).
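
For illustration, the opt-out in a regular job looks like this (job name and script are placeholders):

Build:
  script:
    - make
  except:
    variables:
      - $CI_CUSTOM_ITCHECK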


Gitlab pipelines only/except for WIP merge requests

At ioxp we automate our continuous integration using the awesome Gitlab project. However, their model of when testing jobs run on commits induced quite a heavy load on our testing servers. So we were very delighted when pipelines for merge requests were introduced in Gitlab 11.6.

We then configured our testing to run with

only:merge_requests

However, our development process (more on that in a later post) recommends:

If a developer starts working on an issue, they have to

1. Assign it to themselves (if not already done)
2. Create a branch starting with their two-letter initials, like ph/purposeOfTheBranch,
3. Open a new WIP merge request mentioning the issue, to mark that it is currently being worked on. As the merge request window even offers a small button to automatically add this prefix, the workflow is quite easy here.
4. Assign themselves to the merge request because they are currently responsible for it

We do this to have a nice overview of what is currently going on and to discuss unfinished work. However, our only:merge_requests now became useless – as every implementation starts with a merge request, the pipelines are started immediately, wasting CI time.

Luckily, variable expressions came to our rescue:

integrationtest:
  stage: integrationtest
  tags:
    - withBackend
  only:
    - merge_requests
  except:
    variables:
      - $CI_MERGE_REQUEST_TITLE =~ /^WIP:.*/

It is as easy as that. The expression filters the merge requests by their title, and the job only runs when the title does not start with the WIP prefix.

Update: Newer GitLab versions use the “Draft:” prefix instead of “WIP:” by default, so to support both notations, the script needs to use the OR operator like this:

integrationtest:
  stage: integrationtest
  tags:
    - withBackend
  only:
    - merge_requests
  except:
    variables:
      - $CI_MERGE_REQUEST_TITLE =~ /^WIP:.*/ || $CI_MERGE_REQUEST_TITLE =~ /^Draft:.*/
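
On GitLab versions that already support the newer rules: keyword, the same filter can presumably be written like this (a sketch, not our battle-tested configuration):

integrationtest:
  stage: integrationtest
  tags:
    - withBackend
  rules:
    - if: '$CI_MERGE_REQUEST_TITLE =~ /^(WIP|Draft):/'
      when: never
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'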

Worry-free text writing into OpenCV images

Using cv::putText is cumbersome, and placing your text at the correct position with the correct size is hard. Here is a wrapper function dealing with all of this for you: the text is fitted inside the given image, even multiple lines are possible, and everything is nicely centered.

void ioxp::putText(cv::Mat imgROI, const std::string &text, const int fontFace = cv::FONT_HERSHEY_PLAIN,
    const cv::Scalar color = cv::Scalar::all(255), const int thickness = 1, const int lineType = cv::LINE_8)
{
    /*
     * Split the given text into its lines
     */
    std::vector<std::string> textLines;
    std::istringstream f(text);
    std::string s;
    while (std::getline(f, s, '\n')) {
        textLines.push_back(s);
    }

    /*
     * Calculate the line sizes and overall bounding box
     */
    std::vector<cv::Size> textLineSizes;
    cv::Size boundingBox(0,0);
    int baseline = 0;
    for (std::string line : textLines) {
        cv::Size lineSize = cv::getTextSize(line, fontFace, 1, thickness, &baseline);
        baseline += 2 * thickness;
        lineSize.width += 2 * thickness;
        lineSize.height += baseline;
        textLineSizes.push_back(lineSize);
        boundingBox.width = std::max(boundingBox.width, lineSize.width);
        boundingBox.height += lineSize.height;
    }

    const double scale = std::min(imgROI.rows / static_cast<double>(boundingBox.height),
                                  imgROI.cols / static_cast<double>(boundingBox.width));
    boundingBox.width *= scale;
    boundingBox.height *= scale;
    baseline *= scale;
    for (size_t i = 0; i < textLineSizes.size(); i++) {
        textLineSizes.at(i).width *= scale;
        textLineSizes.at(i).height *= scale;
    }
    /*
     * Draw the text line-by-line
     */
    int y = (imgROI.rows - boundingBox.height + baseline) / 2;
    for (size_t i = 0; i < textLines.size(); i++) {
        y += textLineSizes.at(i).height;
        // center the text horizontally
        cv::Point textOrg((imgROI.cols - textLineSizes.at(i).width) / 2, y - baseline);
        cv::putText(imgROI, textLines.at(i), textOrg, fontFace, scale, color, thickness, lineType);
    }
}

This is how you use it and what the results look like:

    cv::Mat outputImage(360, 640, CV_8UC3);
    outputImage.setTo(0);
    ioxp::putText(outputImage, "Short text");
    cv::imshow("text", outputImage);
    cv::waitKey(0);

    cv::Mat outputImage(360, 640, CV_8UC3);
    outputImage.setTo(0);
    ioxp::putText(outputImage,
      "Some longer text, even with\nmultiple lines spread over the whole image");
    cv::imshow("text", outputImage);
    cv::waitKey(0);

    cv::Mat outputImage(360, 640, CV_8UC3);
    outputImage.setTo(0);
    ioxp::putText(outputImage, "\n\n\nEmpty\n\n\nLines\n\n\n");
    cv::imshow("text", outputImage);
    cv::waitKey(0);

By using the cv::Rect accessor you can define exactly which part of the image the text should be placed in:

    cv::Mat outputImage(360, 640, CV_8UC3);
    outputImage.setTo(0);
    ioxp::putText(outputImage(cv::Rect(0, outputImage.rows / 10, outputImage.cols, outputImage.rows / 10)),
      "Text placed in the upper\n10 percent of the image");
    cv::imshow("text", outputImage);
    cv::waitKey(0);

Combining ARCore tracking and Cardboard Spatial Audio

This week Google released ARCore, their answer to Apple’s recently published Augmented Reality framework ARKit. This is an exciting opportunity for mobile developers to enter the world of Augmented Reality, Mixed Reality, Holographic Games, … whichever buzzword you prefer.

To get to know the AR framework I wanted to test how easy it would be to combine it with another awesome Android framework: Google VR, used for their Daydream and Cardboard platforms. Specifically, its Spatial Audio API. And despite my never having used either of the two libraries, combining them is astonishingly simple.

Cf. https://developers.google.com/vr/concepts/spatial-audio

The results:

The goal is to add correctly rendered three-dimensional sound to an augmented reality application. As a demonstrator, we pin an audio source to each of the little Androids placed in the scene.
Well, screenshots don’t make sense for demonstrating audio, but without them this post would look so lifeless 🙂 Unfortunately, I did not manage to do a screen recording which includes the audio feed.

The how-to:

  1. Set up ARCore as explained in the documentation. Currently, only the Google Pixel and the Samsung Galaxy S8 are supported, so you need one of those to test it out. Device coverage will increase in the future.
  2. The following step-by-step tutorial starts at the sample project located in /samples/java_arcore_hello_ar; it is based on the current Github repository’s HEAD.
  3. Open the application’s Gradle build file at /samples/java_arcore_hello_ar/app/build.gradle and add the VR library to the dependencies
    dependencies {
        ...
        compile 'com.google.vr:sdk-audio:1.10.0'
    }
    
  4. Place a sound file in the asset folder. I had some trouble getting it to work until I found out that it has to be a 32-bit float mono wav file. I used Audacity for the conversion:
    1. Open your Audio file in Audacity
    2. Click Tracks -> Stereo Track to Mono
    3. Click File -> Export. Select “Other uncompressed files” as type, click Options and select “WAV” as Header and “Signed 32 bit PCM” as encoding

    I used “Sam’s Song” from the Ubuntu Touch Sound Package and you can download the correctly converted file here.

  5. We have to apply three modifications to the sample’s HelloArActivity.java: (1) bind the GvrAudioEngine to the Activity’s lifecycle, (2) add a sound object for every object placed into the scene and (3) continuously update the audio object positions and the listener position. You find the relevant sections below.

    public class HelloArActivity extends AppCompatActivity implements GLSurfaceView.Renderer {
        /*
        ...
        */
        private GvrAudioEngine mGvrAudioEngine;
        private ArrayList<Integer> mSounds = new ArrayList<>();
        final String SOUND_FILE = "sams_song.wav";
    
        @Override
        protected void onCreate(Bundle savedInstanceState) {
            /*
            ...
             */
            mGvrAudioEngine = new GvrAudioEngine(this, GvrAudioEngine.RenderingMode.BINAURAL_HIGH_QUALITY);
            new Thread(
                new Runnable() {
                    @Override
                    public void run() {
                        // Prepare the audio file and set the room configuration to an office-like setting
                        // Cf. https://developers.google.com/vr/android/reference/com/google/vr/sdk/audio/GvrAudioEngine
                        mGvrAudioEngine.preloadSoundFile(SOUND_FILE);
                        mGvrAudioEngine.setRoomProperties(15, 15, 15, PLASTER_SMOOTH, PLASTER_SMOOTH, CURTAIN_HEAVY);
                    }
                })
            .start();
        }
    
        @Override
        protected void onResume() {
            /*
            ...
             */
            mGvrAudioEngine.resume();
        }
    
        @Override
        public void onPause() {
            /*
            ...
             */
            mGvrAudioEngine.pause();
        }
    
        @Override
        public void onDrawFrame(GL10 gl) {
            // Clear screen to notify driver it should not load any pixels from previous frame.
            GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT | GLES20.GL_DEPTH_BUFFER_BIT);
    
            try {
                // Obtain the current frame from ARSession. When the configuration is set to
                // UpdateMode.BLOCKING (it is by default), this will throttle the rendering to the
                // camera framerate.
                Frame frame = mSession.update();
    
                // Handle taps. Handling only one tap per frame, as taps are usually low frequency
                // compared to frame rate.
                MotionEvent tap = mQueuedSingleTaps.poll();
                if (tap != null && frame.getTrackingState() == TrackingState.TRACKING) {
                    for (HitResult hit : frame.hitTest(tap)) {
                        // Check if any plane was hit, and if it was hit inside the plane polygon.
                        if (hit instanceof PlaneHitResult && ((PlaneHitResult) hit).isHitInPolygon()) {
                            /*
                            ...
                             */
                            int soundId = mGvrAudioEngine.createSoundObject(SOUND_FILE);
                            float[] translation = new float[3];
                            hit.getHitPose().getTranslation(translation, 0);
                            mGvrAudioEngine.setSoundObjectPosition(soundId, translation[0], translation[1], translation[2]);
                            mGvrAudioEngine.playSound(soundId, true /* looped playback */);
                            // Set a logarithmic rolloff model and mute after four meters to limit audio chaos
                            mGvrAudioEngine.setSoundObjectDistanceRolloffModel(soundId, GvrAudioEngine.DistanceRolloffModel.LOGARITHMIC, 0, 4);
                            mSounds.add(soundId);
    
                            // Hits are sorted by depth. Consider only closest hit on a plane.
                            break;
                        }
                    }
                }
                /*
                 ...
                 */
                // Visualize planes.
                mPlaneRenderer.drawPlanes(mSession.getAllPlanes(), frame.getPose(), projmtx);
    
                // Visualize anchors created by touch.
                float scaleFactor = 1.0f;
                for (int i=0; i < mTouches.size(); i++) {
                    PlaneAttachment planeAttachment = mTouches.get(i);
                    if (!planeAttachment.isTracking()) {
                        continue;
                    }
                    // Get the current combined pose of an Anchor and Plane in world space. The Anchor
                    // and Plane poses are updated during calls to session.update() as ARCore refines
                    // its estimate of the world.
                    planeAttachment.getPose().toMatrix(mAnchorMatrix, 0);
    
                    // Update and draw the model and its shadow.
                    mVirtualObject.updateModelMatrix(mAnchorMatrix, scaleFactor);
                    mVirtualObjectShadow.updateModelMatrix(mAnchorMatrix, scaleFactor);
                    mVirtualObject.draw(viewmtx, projmtx, lightIntensity);
                    mVirtualObjectShadow.draw(viewmtx, projmtx, lightIntensity);
    
                    // Update the audio source position since the anchor might have been refined
                    float[] translation = new float[3];
                    planeAttachment.getPose().getTranslation(translation, 0);
                    mGvrAudioEngine.setSoundObjectPosition(mSounds.get(i), translation[0], translation[1], translation[2]);
                }
    
                /*
                 * Update the listener's position in the audio world
                 */
                // Extract positional data
                float[] translation = new float[3];
                frame.getPose().getTranslation(translation, 0);
                float[] rotation = new float[4];
                frame.getPose().getRotationQuaternion(rotation, 0);
    
                // Update audio engine
                mGvrAudioEngine.setHeadPosition(translation[0], translation[1], translation[2]);
                mGvrAudioEngine.setHeadRotation(rotation[0], rotation[1], rotation[2], rotation[3]);
                mGvrAudioEngine.update();
            } catch (Throwable t) {
                // Avoid crashing the application due to unhandled exceptions.
                Log.e(TAG, "Exception on the OpenGL thread", t);
            }
        }
        /*
         ...
         */
    }
    
    
  6. That’s it! Now, every Android placed into the scene also plays back audio.

Some findings:

  1. Setting up ADB via WiFi is really helpful as you will walk around a lot and don’t want to reconnect USB every time (see the commands after this list).
  2. Placing the Androids too close to each other will produce a really annoying sound chaos. You can modify the rolloff model to reduce this (cf. the setSoundObjectDistanceRolloffModel call in the code excerpt above).
  3. It matters how you hold your phone (portrait with the current code), because ARCore measures the physical orientation of the device but the audio coordinate system is (not yet) rotated accordingly. If you want to use landscape mode, it is sufficient to set the Activity in the manifest to android:screenOrientation="landscape".
  4. Ask questions tagged with the official arcore tag on Stack Overflow, the Google developers are reading them!
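
Regarding the first finding, ADB over WiFi takes just two commands while the phone is still connected via USB (replace the IP address with your phone’s):

adb tcpip 5555
adb connect 192.168.0.42:5555

Afterwards you can unplug the cable and walk around freely.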