Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a 32bpp SSE2 palette animator. #6469

Closed
DorpsGek opened this issue May 25, 2016 · 9 comments
Closed

Add a 32bpp SSE2 palette animator. #6469

DorpsGek opened this issue May 25, 2016 · 9 comments
Labels
flyspray This issue is imported from FlySpray (https://bugs.openttd.org/) patch from FlySpray This issue is in fact a Patch, but imported from FlySrpay

Comments

@DorpsGek
Copy link
Member

JGR opened the ticket and wrote:

Add a 32bpp SSE2 palette animator. When tested this was approximately ~4x faster than 32bpp-anim's palette animator.

Create a new blitter mode: 32bpp-sse2-anim, which is 32bpp-anim + this palette animator.
32bpp-sse2-anim is now used by default where 32bpp-anim would have been.
Also use this palette animator with the 32bpp-sse4-anim blitter.

This changes the alignment requirement of the palette animation buffer and each line within in, which is the reason for the buffer offset changes in other parts of the blitters.

This does not change rendering or other non palette-animation blitter functionality.

Attachments

Reported version: trunk
Operating system: All


This issue was imported from FlySpray: https://bugs.openttd.org/task/6469
@DorpsGek
Copy link
Member Author

DorpsGek commented Sep 2, 2017

andythenorth wrote:

Fails to apply to r27908. I didn't paste all the .rej contents, let me know if you want those.

openttd-trunk(master)$ curl https://bugs.openttd.org/task/6469/getfile/10530/32bpp-anim-sse2-palette-animator.diff | patch -p0
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 20416 100 20416 0 0 106k 0 --:--:-- --:--:-- --:--:-- 106k
patching file source.list
patching file src/blitter/32bpp_anim.cpp
Hunk # 2 FAILED at 39.
Hunk # 3 FAILED at 281.
Hunk # 4 FAILED at 292.
Hunk # 5 FAILED at 305.
Hunk # 6 FAILED at 319.
Hunk # 7 FAILED at 333.
Hunk # 8 FAILED at 347.
Hunk # 9 FAILED at 357.
Hunk # 10 FAILED at 370.
Hunk # 11 FAILED at 401.
Hunk # 12 FAILED at 410.
Hunk # 13 FAILED at 422.
Hunk # 14 FAILED at 436.
Hunk # 15 FAILED at 457.
Hunk # 16 FAILED at 484.
Hunk # 17 FAILED at 515.
16 out of 17 hunks FAILED -- saving rejects to file src/blitter/32bpp_anim.cpp.rej
patching file src/blitter/32bpp_anim.hpp
Hunk # 1 FAILED at 18.
Hunk # 2 succeeded at 62 (offset 4 lines).
1 out of 2 hunks FAILED -- saving rejects to file src/blitter/32bpp_anim.hpp.rej
patching file src/blitter/32bpp_anim_sse2.cpp
patching file src/blitter/32bpp_anim_sse2.hpp
patching file src/blitter/32bpp_anim_sse4.cpp
Hunk # 1 FAILED at 35.
Hunk # 2 FAILED at 353.
2 out of 2 hunks FAILED -- saving rejects to file src/blitter/32bpp_anim_sse4.cpp.rej
patching file src/blitter/32bpp_anim_sse4.hpp
patching file src/blitter/32bpp_base.cpp
patching file src/blitter/32bpp_base.hpp
patching file src/blitter/32bpp_sse_func.hpp
patching file src/gfxinit.cpp
Hunk # 1 succeeded at 284 (offset 18 lines).
patching file src/stdafx.h
Hunk # 1 succeeded at 535 (offset 19 lines).


This comment was imported from FlySpray: https://bugs.openttd.org/task/6469#comment14718

@DorpsGek
Copy link
Member Author

DorpsGek commented Sep 3, 2017

andythenorth wrote:

I tried this in JGR's patchpack where it's included.

In subjective testing, comparing with trunk r27910, using this blitter is substantially faster on OS X (judging by ffwd, for the same savegame in both versions).

There might be other performance reasons why JGR PP is faster than trunk, but this blitter change looks worth reviewing.

I suspect this also eliminates or mitigates #6546.


This comment was imported from FlySpray: https://bugs.openttd.org/task/6469#comment14747

@DorpsGek
Copy link
Member Author

DorpsGek commented Sep 3, 2017

LordAro wrote:

Patch got broken by r27796. This completely supercedes that patch though, so a revert of that lets this patch apply again

Git patch attached. Diff slightly cleaned up, but nothing significant

Attachments


This comment was imported from FlySpray: https://bugs.openttd.org/task/6469#comment14751

@DorpsGek
Copy link
Member Author

DorpsGek commented Sep 3, 2017

andythenorth wrote:

Applied LordAro's patch on trunk r27911. Same subjective-but-obvious speed improvement when using full animation. This is substantially improved on OS X 10.12.6 / 3.3GHz i7.


This comment was imported from FlySpray: https://bugs.openttd.org/task/6469#comment14752

@DorpsGek DorpsGek added Core flyspray This issue is imported from FlySpray (https://bugs.openttd.org/) patch from FlySpray This issue is in fact a Patch, but imported from FlySrpay labels Apr 7, 2018
@TrueBrain
Copy link
Member

Some comments on the patch:

  • Feels that introducing likely/unlikely is a bit out-of-scope for this patch. Might help with performance, but I think that should be another patch.
  • ReallyAdjustBrightness is a silly name. I am sure we can come up with a better one :)

I cannot judge if the SSE2 code is any good, so I believe that on face value :) So just some minor stuff. Sadly, not a PR yet, so a bit hard to comment.

@JGRennison
Copy link
Contributor

The patch as posted above has bug(s) in it which have since been fixed in my patchpack.
I will try to post an updated patch/PR when circumstances permit (may be some time). I can leave out the likely hints, or make those a separate PR or commit in the same PR.
I'm not precious about names, I'm open to suggestions.

@TrueBrain
Copy link
Member

Awesome!

A separate PR for the likely/unlikely sounds perfect :)

Instead of ReallyAdjustBrightness, possible something like: ExecuteAdjustBrightness? AdjustBrightnessInternal? CalculateAdjustedBrightness? Dunno .. something more explaining :)

Tnx!

@frosch123 frosch123 removed the Core label Apr 14, 2018
@PeterN
Copy link
Member

PeterN commented May 22, 2018

Any progress on this? Or should we try cherry-picking from your repo?

JGRennison added a commit to JGRennison/Upstream-OpenTTD that referenced this issue May 23, 2018
Create a new blitter mode: 32bpp-sse2-anim, which is 32bpp-anim + this.
32bpp-sse2-anim is now used by default where 32bpp-anim would have been.
Also use this with the 32bpp-sse4-anim blitter.

See issue OpenTTD#6469.
JGRennison added a commit to JGRennison/Upstream-OpenTTD that referenced this issue May 23, 2018
Create a new blitter mode: 32bpp-sse2-anim, which is 32bpp-anim + this.
32bpp-sse2-anim is now used by default where 32bpp-anim would have been.
Also use this with the 32bpp-sse4-anim blitter.

See issue OpenTTD#6469.
PeterN pushed a commit that referenced this issue May 23, 2018
Create a new blitter mode: 32bpp-sse2-anim, which is 32bpp-anim + this.
32bpp-sse2-anim is now used by default where 32bpp-anim would have been.
Also use this with the 32bpp-sse4-anim blitter.

See issue #6469.
@PeterN
Copy link
Member

PeterN commented May 23, 2018

Implemented!

@PeterN PeterN closed this as completed May 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flyspray This issue is imported from FlySpray (https://bugs.openttd.org/) patch from FlySpray This issue is in fact a Patch, but imported from FlySrpay
Projects
None yet
Development

No branches or pull requests

5 participants