Re: character issue



Katsumi Yamaoka <yamaoka@xxxxxxx> writes:

> Hi,

Hello, thanks for your reply.

> How does your speech synth speak the following apostrophe?
>
> Freedom’s Watch

It speaks it as if it weren't there.  The synth acts as if there is a
space there.  Spelling it out, it looks like this:

Freedom s Watch

But that is not so much a problem.  _Much_ better than this:

Freedom question mark s Watch

> Since I found many web sites that offer the sentence that you
> cited using utf-8 charset, I tried writing this message with
> utf-8.  If the synth speaks it wrongly, I guess it does not
> support such a character.  

The page I was viewing 

http://www.nytimes.com/2007/09/30/us/politics/30watch.html?ex=1348891200&en=02eb54b65d042599&ei=5124&partner=permalink&exprod=permalink

has this line in the source:

<meta http-equiv="Content-Type" content="text/html; charset=iso8859-15">

but setting the coding system variables to this charset does not help.

I don't think it is so much an issue with the synth.  As I understand
it, the synth speaks what is on the screen (with some exceptions like u
umlaut and c cedilla which are not characters in English).  Here is
another example that might help.  If I type the word "don't", it is
spoken correctly.  If I were to encounter it in utf-8, it would sound
like "don tee".  Still, this is preferable to "don question mark tee".  

> But if it does, the problem will be due to emacs-w3m, or your
> configuration.  In the latter case, we probably need to know the whole
> contents of your .emacs file and .emacs-w3m.el file, and the locale
> information in your system.  Because I have never encountered such a
> problem nowadays.  Though, I have no idea what to do with them for the
> moment.

This could be an issue with the configuration as it pertains to the
character set used in Emacs itself.  The synth, as I understand it,
chokes on multibyte characters.  Emacspeak requires unibyte to be on.  I
don't really understand the whole issue with character sets, so I cannot
really speak knowledgeably here.  

What I don't understand about this issue is that emacs/w3 displays the
page with the octal representation of the character.  I am afraid I was
not clear before: I do not "see" the character itself, only its octal
representation ("\xyz" instead of "'").
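
For illustration only, here is a minimal Python sketch of how a curly
apostrophe (U+2019) can end up as an octal escape when its raw bytes
are shown instead of the character.  The helper names are made up for
this example, and which encoding the page really used is an assumption;
cp1252 is included only because it is a common source of this
character on pages that declare a Latin charset.

```python
# The right single quotation mark used in "Freedom’s Watch".
APOSTROPHE = "\u2019"


def octal_escapes(text, encoding):
    """Return the bytes of `text` in `encoding` as backslash-octal escapes."""
    return "".join("\\%03o" % byte for byte in text.encode(encoding))


def try_octal(text, encoding):
    """Like octal_escapes, but return None if the charset lacks the character."""
    try:
        return octal_escapes(text, encoding)
    except UnicodeEncodeError:
        return None


for enc in ("utf-8", "cp1252", "iso8859-15"):
    # utf-8 gives three bytes, cp1252 gives one, and iso8859-15 has no
    # code point for U+2019 at all, so try_octal returns None there.
    print(enc, try_octal(APOSTROPHE, enc))
```

If the last case is right, a page that declares iso8859-15 but contains
this apostrophe cannot be decoded cleanly with that charset, which may
be why changing the coding system alone does not help.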

> Otherwise, we can offer a filter program that converts the
> apostrophe in question into the ASCII character for emacs-w3m.

While this sort of thing would be appreciated, it seems like a bad fix.
I might ask on the emacspeak list to see if anyone else is seeing this
and if so what they are doing about it.
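
To make the idea concrete, a filter of the kind suggested above might
look roughly like the following Python sketch.  This is not the filter
offered for emacs-w3m; the mapping and the names asciify_quotes and
filter_stream are hypothetical, and how such a filter would be hooked
into emacs-w3m is not shown.

```python
import io

# Hypothetical mapping from typographic quotes to their ASCII fallbacks.
ASCII_FALLBACKS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (the apostrophe in question)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}
_TABLE = str.maketrans(ASCII_FALLBACKS)


def asciify_quotes(text):
    """Replace curly quotes in already-decoded text with ASCII quotes."""
    return text.translate(_TABLE)


def filter_stream(src, dst):
    """Copy text lines from src to dst, converting quotes on the way."""
    for line in src:
        dst.write(asciify_quotes(line))


# Small demonstration with in-memory streams instead of real files.
src = io.StringIO("Freedom\u2019s Watch\n")
dst = io.StringIO()
filter_stream(src, dst)
print(dst.getvalue(), end="")
```

Note that this works on decoded text, so it only helps once the page
has been read with the correct charset; it does nothing for bytes that
were already turned into octal escapes.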

BTW, I did try to set the following variables to ascii a few months ago:

w3m-input-coding-system 
w3m-output-coding-system 

but the result was the same.

Thanks again for your help,
rdc
-- 
Robert D. Crawford                                      rdc1x@xxxxxxxxxxx

If this fortune didn't exist, somebody would have invented it.