It actually sounds like it's looking for the audio from the still picture rather than the video in the box. I'll have to experiment to see if I can replicate the scenario.
EDIT: Just as I suspected. A limitation of the app requires the audio to be on the footage in position 1 in a picture-in-picture (PiP) scenario. I set up a sample PiP with a still in position 1 and the video in position 2, and I got no sound, the same as you. If I reverse the positions, the sound from the video plays.
However handy these apps are, they are not as fully featured as the PC or Mac versions, only allowing for basic editing.