You could build the content structure as a single root node and then have multiple Language homepage nodes directly beneath the content root.
To assign a language, you could create a custom data type that simply lists all the .NET cultures, e.g. en-GB, fr-FR etc. Add that data type as a property on the Language homepage document type, then output its value in the markup of the homepage and each of its descendants.
In the Language homepage document type, you can add a textstring property with the alias 'umbracoUrlName' and use it to override the URL name. For example, the page could then be served at www.domain.com/en/ instead of www.domain.com/en/english-home/
With regard to duplicating the site at a later date, this is a difficult one. If the links are created using data types like the media picker and uComponents' multi-node tree picker, then you will have no option but to inherit the links from the branch you copied. However, if the links are generated dynamically in Razor or XSLT, then you should be able to make them relative to the Language homepage or the current page. E.g. in XSLT, getting the children of the parent Language homepage would be something like $currentPage/ancestor-or-self::*[@level = 2]/child::*. In other words, you can avoid hard-coding links by using a clever bit of relative traversal.
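As a rough illustration of that relative traversal, here is a minimal sketch of an XSLT macro that lists the pages directly under whichever Language homepage the current page belongs to. It assumes the structure described above (a single root at level 1, Language homepages at level 2) and the newer XML schema (Umbraco 4.5+), where document nodes carry an isDoc attribute; on the legacy schema you would select node elements instead. The XPath axis is spelled ancestor-or-self.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:umbraco.library="urn:umbraco.library"
    exclude-result-prefixes="umbraco.library">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Umbraco passes the page being rendered in as this parameter -->
  <xsl:param name="currentPage"/>

  <xsl:template match="/">
    <!-- Assumption: Language homepages sit at level 2, directly under the single root -->
    <xsl:variable name="langHome"
        select="$currentPage/ancestor-or-self::*[@level = 2]"/>

    <ul>
      <!-- Assumption: new schema, so [@isDoc] keeps documents and skips property elements -->
      <xsl:for-each select="$langHome/*[@isDoc]">
        <li>
          <!-- NiceUrl resolves the URL from the node id, so nothing is hard-coded -->
          <a href="{umbraco.library:NiceUrl(@id)}">
            <xsl:value-of select="@nodeName"/>
          </a>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Because every link is resolved from the current page's own Language homepage, a copied branch should produce links into itself rather than back into the branch it was copied from.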