ï¿½ / � Appearing below item aspects when pressing shift #508

Open
AngryScandinavian opened this issue Apr 26, 2018 · 24 comments

Comments

@AngryScandinavian


[Screenshot: 2018-04-25_16 55 39]

Sorry, I'm not exactly sure of the criteria for posting bugs here, but I was told to direct this post to bug reports. It's strange because everyone else here doesn't seem to be seeing this bug, but anyway...
https://www.reddit.com/r/feedthebeast/comments/8eyx4f/thaumcraft_item_inspect_shift_adding_weird/
If I need to post any explicit information here let me know. Thanks.

@NimmerNeko
Contributor

It's there for me too.

@Toksyuryel

I get this as well, I thought it was normal though because at this point I'm used to magic mods displaying gibberish for "unknown" things and figured with more research the text would reveal itself.

@Dayyer1

Dayyer1 commented Apr 26, 2018

I also have it.

@Cenotaffio

I have it also.

@laserlemons

Also getting it. I thought it was intentional, like something I wasn't able to read without a certain research.

@AngryScandinavian
Author

AngryScandinavian commented Apr 26, 2018

I thought it was some research-y thing too, but I’d seen the symbols elsewhere, just as the person in the Reddit thread explains in detail.

Some specifics: it seems to happen with Thaumcraft by itself, without any other mods. I’m going to do a clean install of Forge when I’m home to see if there are any hidden settings messing with it and edit this comment later.
Edit: Yep, seems to still be working incorrectly.

@Mike-U5

Mike-U5 commented Apr 26, 2018

Oh so that IS a bug. How about that.

@rmunn
Copy link

rmunn commented Apr 27, 2018

I'll repost here what I said in that Reddit discussion:

The combination ï¿½ is a sign that some Unicode input is getting screwed up somewhere, in two consecutive ways. Specifically, first reading something that was not UTF-8 (probably Latin-1) as if it was UTF-8, and then writing the resulting incorrect UTF-8 into an environment that was expecting Latin-1. Looking at the tables for the Latin-1 code page, we can see that ï has hex value EF, ¿ has hex value BF, and ½ has hex value BD. The hex sequence EF BF BD is UTF-8 encoding for U+FFFD, the Unicode replacement character �. Usually you'll only see that character when something that was encoded with one Unicode encoding is read with a different encoding, and the most common mistake that can trigger this is reading a non-UTF-8 text as if it was really UTF-8. Then the � character got encoded as UTF-8 (where it becomes the three-byte sequence EF BF BD), and some other code read that byte sequence and thought it was Latin-1.
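
The two consecutive mistakes can be reproduced in a few lines of Java. This is only a sketch: the class and method names (`MojibakeDemo`, `mangle`, `hex`) are made up for illustration, and the input character © (A9 in Latin-1) is an arbitrary stand-in for whatever non-ASCII character is really being mangled.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Applies the two decoding mistakes in sequence (hypothetical helper). */
    public static String mangle(String original) {
        // Starting point: the text is stored as Latin-1 bytes.
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        // Mistake 1: decode those bytes as UTF-8. A byte such as A9 on its own
        // is not valid UTF-8, so Java substitutes U+FFFD for it.
        String misread = new String(latin1, StandardCharsets.UTF_8);
        // The damaged text is then written out as UTF-8 (U+FFFD becomes EF BF BD).
        byte[] utf8 = misread.getBytes(StandardCharsets.UTF_8);
        // Mistake 2: decode the UTF-8 bytes as Latin-1, one character per byte.
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Formats bytes as space-separated upper-case hex. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Arbitrary example input: the copyright sign, written as an escape.
        String mangled = mangle("\u00A9");
        // Prints: EF BF BD
        System.out.println(hex("\uFFFD".getBytes(StandardCharsets.UTF_8)));
        // Prints: true (the result is the three characters U+00EF U+00BF U+00BD)
        System.out.println(mangled.equals("\u00EF\u00BF\u00BD"));
    }
}
```

Any ASCII characters in the input pass through both steps unchanged, which is why the surrounding text stays readable and only the non-ASCII character turns into three junk characters.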

@AnrDaemon

And this is a very old bug, present even in TC4 :(

@androshalforc

Hmm, I thought it was just something I had yet to discover that would make sense later on.

@FoxMcloud5655

Same here. Interesting.

@Azanor
Owner

Azanor commented May 8, 2018

I've actually been trying to track down this bug most of the day. It is indeed a codepage issue but I'm not sure what is causing it.

@AnrDaemon

I wonder what text it is trying to print. I was thinking aspect names, but unfortunately they appear to be hardcoded.

@Geethebluesky

Geethebluesky commented May 8, 2018

At the risk of suggesting something really freaking obvious, I've had codepage issues in the past while loading text files that were simply saved with the wrong UTF encoding (UTF with/without BOM, for example). If you haven't tried that, it might help?

@AnrDaemon

I hate modern IDEs. They are too good for their own good.

Another possibly useful bit of information is that these garbage strings are exposed through the API.
E.g., the Thaumcraft NEI plugin shows them on the aspect information page:

[Screenshot: javaw-20180509-014903 75]

@rmunn

rmunn commented May 9, 2018

Now that I see that second screenshot, I think I've figured it out. The text that's getting mangled is the § character, which is part of Minecraft formatting codes. For example, §l means bold, and §o means italic. So in the first screenshot, the text under the word Waystone has two bold-formatting markers in it, but in one of them, the § (which is A7 in the Latin-1 encoding) has been misread as UTF-8, changing it to � (EF BF BD in UTF-8), so that §l becomes ï¿½l (EF BF BD 6C) after the Latin-1 → UTF-8 → Latin-1 roundtrip. And in the second screenshot, §oVacuos becomes ï¿½oVacuos (EF BF BD 6F plus the word Vacuos) after the same roundtrip.

So the place to look in the Thaumcraft code is wherever it handles Minecraft formatting codes like §l or §o.

@tterrag1098

@Azanor This can be caused by Unicode text in source files. Instead of writing out the '§' character, use Minecraft's TextFormatting enum. Or just use Unicode escapes.
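
A rough sketch of the Unicode-escape option, kept free of Minecraft classes so it compiles on its own. The class and helper names (`FormattingCodes`, `bold`, `italic`) are hypothetical and not from Thaumcraft; the codes themselves are the standard ones (§l bold, §o italic, §r reset).

```java
public class FormattingCodes {
    // U+00A7 SECTION SIGN written as an escape, so the source file stays
    // pure ASCII and compiles the same under any source encoding.
    public static final char SECTION = '\u00A7';

    /** Wraps text in the bold code and a trailing reset (hypothetical helper). */
    public static String bold(String text) {
        return SECTION + "l" + text + SECTION + "r";
    }

    /** Wraps text in the italic code and a trailing reset (hypothetical helper). */
    public static String italic(String text) {
        return SECTION + "o" + text + SECTION + "r";
    }

    public static void main(String[] args) {
        // Prints: a7 (the first character is the section sign itself)
        System.out.println(Integer.toHexString(bold("Waystone").charAt(0)));
    }
}
```

Because the escape is resolved by the compiler from plain ASCII, a wrong source-encoding setting has nothing to misread.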

@rmunn

rmunn commented May 17, 2018

This also shows up in this mod spotlight by Direwolf20, where the ° (degree) symbol has been replaced by ï¿½. You can see it at 33:42 of that video, or in the partial screen capture below:

[Screenshot: degrees]

@AnrDaemon

 -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8

I always run Minecraft with these settings, to guard against irresponsible mods that do not set the correct encoding on files they read.
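
For code that reads its own text files, passing the charset explicitly removes the dependence on JVM flags like the ones above. A minimal sketch, assuming a hypothetical helper `readUtf8Lines` and a throwaway temp file (neither is taken from any real mod):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ExplicitCharset {
    /** Reads a text file as UTF-8 regardless of the JVM's default charset. */
    public static List<String> readUtf8Lines(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file containing a non-ASCII character (the degree sign).
        Path tmp = Files.createTempFile("lang-sample", ".txt");
        Files.write(tmp, "angle=10\u00B0\n".getBytes(StandardCharsets.UTF_8));
        // Prints: true
        System.out.println(readUtf8Lines(tmp).get(0).equals("angle=10\u00B0"));
        Files.delete(tmp);
    }
}
```

This only covers text read from files at runtime; it does nothing for strings that were already mangled before they reached the jar.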

@tterrag1098

That will likely not solve this bug, as the problems come from data in the .class files themselves. The mod should be using Unicode escapes in the code.

@AnrDaemon

Sorry, my head ain't very clear.
That won't solve this particular issue, if anything, it may only make it worse.
My point was that you should not use conversion functions and just use UTF-8 transparently through the code.

@rmunn

rmunn commented May 18, 2018

Seeing the ° symbol get corrupted in Direwolf20's video led me to do a bit of hunting. Looking through the en_us.lang file led me to find the focus.scatter.cone string, and searching for the string "focus.scatter" through the .class files in the Beta 13 jar led me to the FocusModScatter.class file. I ran that file through a decompiler to see whether I could spot anything weird, and found that the decompiler produced the following line:

String[] anglesDesc = new String[]{"10\u00ef\u00bf\u00bd", "30\u00ef\u00bf\u00bd", ... }

@Azanor, it looks to me like some step in your compilation process is causing this. I'd bet your original .java files had that string array defined as "10°", "30°", .... But by the time it's compiled to a .class file, the ° symbol has already been turned into an EF-BF-BD sequence. So it looks like this isn't a bug in your code per se, but some misconfiguration in your compilation process: you have a source file that's not UTF-8 (maybe it's Latin-1) and some step is reading it as UTF-8, and then compiling a mangled string into the .class file. (And from what I could see, you might also want to look at RenderEventHandler.java.)

(I obtained the beta 13 .jar file directly from https://minecraft.curseforge.com/projects/thaumcraft, BTW, so I know that it's not been modified by someone else. And @Azanor, thank you for allowing the use of decompilers in your license posted on CurseForge, since that allowed me to do this research legally. I hope my findings will be helpful to you in tracking down the problem.)
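
The hunt described above could be partly automated by scanning every .class file in a jar for the mangled sequence. This is a sketch under assumptions, not the method actually used: the class name `MojibakeScanner` and its methods are hypothetical, it needs Java 9+ for `InputStream.readAllBytes`, and it relies on the fact that the characters U+00EF U+00BF U+00BD are stored in a class file's constant pool as the bytes C3 AF C2 BF C2 BD.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class MojibakeScanner {
    // The three characters U+00EF U+00BF U+00BD as encoded in a class file.
    public static final byte[] SIGNATURE = {
        (byte) 0xC3, (byte) 0xAF, (byte) 0xC2, (byte) 0xBF, (byte) 0xC2, (byte) 0xBD
    };

    /** Naive byte-sequence search; fine for class-file-sized inputs. */
    public static boolean contains(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return true;
        }
        return false;
    }

    /** Returns the names of .class entries in the jar that contain SIGNATURE. */
    public static List<String> scan(String jarPath) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile jar = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.getName().endsWith(".class")) continue;
                try (InputStream in = jar.getInputStream(entry)) {
                    if (contains(in.readAllBytes(), SIGNATURE)) hits.add(entry.getName());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MojibakeScanner path/to/mod.jar
        for (String name : scan(args[0])) System.out.println(name);
    }
}
```

A hit only means the byte pattern is present somewhere in the file, so each result still needs a look in a decompiler to confirm it is a mangled string literal.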

Azanor closed this as completed Sep 25, 2018

@AnrDaemon

Ow? So, what was the origin of the issue?

@Azanor
Owner

Azanor commented Sep 25, 2018

Oops, this was not supposed to be closed. I've been having odd issues all day with git.

Azanor reopened this Sep 25, 2018