I don't have experience with a project like this, but I can tell you what I'd go for if I were doing it.
I'd choose HTML and CSS for the front-end. For the database, I'd go for MySQL. And to tie the two sides together, I'd use a server-side language like PHP. You can always add more pizzazz.
As for tagging, I have no idea how to integrate voices. But check this out: according to Face.com's official documentation, "We offer services for detecting, recognizing, and tagging faces in any photo, through our REST API". It would be interesting to take a look.
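To make the front-end / database / server-side split a bit more concrete, here is a rough sketch of how the three pieces could fit together. Everything in it is an assumption on my part: the `photos` and `tags` tables, the column names, and the helper functions are hypothetical, and I've used Python with the built-in `sqlite3` module as a stand-in for PHP and MySQL so the example runs without any setup. It does not call Face.com or any other face-recognition service; the tags are just added by hand.

```python
import sqlite3
from html import escape


def open_db():
    """Create an in-memory database with hypothetical photos/tags tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE photos (
            id INTEGER PRIMARY KEY,
            filename TEXT NOT NULL
        );
        CREATE TABLE tags (
            id INTEGER PRIMARY KEY,
            photo_id INTEGER NOT NULL REFERENCES photos(id),
            label TEXT NOT NULL
        );
        """
    )
    return conn


def add_photo(conn, filename):
    """Store a photo record and return its generated id."""
    cur = conn.execute("INSERT INTO photos (filename) VALUES (?)", (filename,))
    return cur.lastrowid


def add_tag(conn, photo_id, label):
    """Attach a text label (e.g. a person's name) to a photo."""
    conn.execute(
        "INSERT INTO tags (photo_id, label) VALUES (?, ?)", (photo_id, label)
    )


def render_photo(conn, photo_id):
    """Server-side step: read from the database and build the HTML to send."""
    row = conn.execute(
        "SELECT filename FROM photos WHERE id = ?", (photo_id,)
    ).fetchone()
    if row is None:
        return "<p>Photo not found.</p>"
    labels = [
        r[0]
        for r in conn.execute(
            "SELECT label FROM tags WHERE photo_id = ? ORDER BY id", (photo_id,)
        )
    ]
    # Escape user-supplied text before putting it into the page.
    items = "".join(f"<li>{escape(label)}</li>" for label in labels)
    return (
        f'<figure><img src="{escape(row[0])}" alt="">'
        f'<ul class="tags">{items}</ul></figure>'
    )


if __name__ == "__main__":
    conn = open_db()
    photo_id = add_photo(conn, "party.jpg")
    add_tag(conn, photo_id, "Alice")
    print(render_photo(conn, photo_id))
```

In a PHP/MySQL version the shape would be the same: the database holds the photos and their tags, the server-side script queries it with parameterized statements, and the result goes out as HTML that you style with CSS.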