[Proposal] Re-Organizing the Audio Section of the General Criteria

Topic Starter
Noffy
After reviewing the current RC, I felt that the current audio section has a lot of different elements blended together. It switches between hitsounding, song compilation rules, and technical file things a few times. It made some topics more difficult to find or reference. I was thinking it could be organized differently.

>>> Click here for my Proposal Document <<<


If you like commenting on the document directly here is a copy for that.

Do I want proofreading: Yes | Do I want to compress it more if possible: Yes, ideas are welcome.


Also worth discussing that isn't fixed yet in the proposal document: Didn't have any support, so will not be pushing the suggestion in the box below. Focus on just the documents for now, thank you c:

removed suggestion
Currently it says:
"beatmaps must be hitsounded ... mania is exempt."

Uhhh taiko doesn't need hitsound additions in the same way as catch and standard do either, so that type of exception is a bit misleading. Like yes taiko is technically using hitsounds under the hood but they're not "adding" to the sound of the map that way, it's literally what key you have to hit.

I was thinking it could change to be
osu! and osu!catch beatmaps must incorporate hitsounding. Hitnormals give feedback to the player, and additions (whistles, claps, and finishes) accent the most important parts of the music.

This could possibly be separated into the mode-specific RCs for osu! and osu!catch, with some rewording to reflect that. "Hitsounds must be audible" is already a separate rule, so the base level of feedback is still covered for all modes even with this change.

_______________________________________________________

There is no specific timeline on this yet, but if you have any interest, any feedback on the current audio RC or ideas are welcome.
Bloxi
If a song can not be found in high quality, use the highest quality available without *up-encoding.*

I prefer the old wording about encoding from a lower bitrate since it makes it more obvious what that actually means without prior knowledge?

The reference table is nice but it would be even better if it included pictures of examples of a "good" mp3 and a "bad" mp3 and mention spek bitrates etc.

Beatmaps of songs which are artificially extended must apply spread rules and guidelines based off of the song’s original length, instead of the extended length.

Will potentially run into issues in the future with "Slowed Ver." songs getting more popular with TikTok I'm telling you. The future generation of mappers will dislike this :P

Some two-song combinations are also exempt, as long as the tracks are closely related.

What does this mean?? Examples pls??

Hitsounds must be audible. Hitsounds with low volume or samples that blend with a song's samples are unacceptable. Specific game modes list exceptions to this rule on their respective ranking criteria.

I prefer the old wording with "the purpose is to provide feedback" here, going back to the "Beatmaps must be hitsounded" rule now makes it sound like additions are necessary for a map to be properly "hitsounded".

As someone who loves doing older style hitsounds without any additions, the normal-hitnormal gives plenty of feedback already and this almost invalidates that interpretation?
Decku
I do genuinely agree that the ranking criteria here does need to be updated a little to be modernized with the changes and overall uniqueness. But in saying that, shouldn't hitsounds be their own thing in the general RC, the same as Audio? Although hitsounding is a type of audio, it is a big aspect of the Ranking Criteria in itself and could either have its own section or, as suggested, its own sub-section in the wiki.

I genuinely feel like the table is useless in this case as well Blaxi. If we REALLY wanted to generally summarize how audio works and examples, an entirely new osu!wiki page should be created in order to accommodate this rather than trying to add more into the General RC. A completely new document explaining the differences between the audio bitrates or better yet, img links to examples would be great to help the overall viewer gain some insight on the spot rather than just clicking on an entirely new link.

Noffy wrote:

osu! and osu!catch beatmaps must incorporate hitsounding. Hitnormals give feedback to the player, and additions (whistles, claps, and finishes) accent the most important parts of the music.
About this, it would sound a lot weirder if it was just those two modes. I understand that in Taiko the hitsounds basically are the map, but at the end of the day it's still hitsounds and shouldn't be lenient based on the relativity of the music. Mania is different for the reasons we all know and should be kept that way.
The hitsounding section can always be applied to osu!taiko's ranking criteria if it hasn't already been placed in it.

I did ask Irone OSU about it and this is what they stated:


I do believe everything else seems okay. But in general messing around with the arrangement of the General RC can always come in with some problems.
Okoayu
welcome Blaxi, Bloxi's evil twin

- change do not beatmap to `do not map`
- ensures instant ... should be positioned to apply to both previous points

Deleted "combinations of 2 songs should be clearly related", as all the exceptions with regards to draintime rules are already handled with marathons & compilations
Topic Starter
Noffy
Went through this a little bit with Okoayu as well while looking at the feedback so far, updated the documents.


A comment on the google doc brought up:

Regarding audible distortions in audio ->
"What if intended distortions and such (or even intended loudness) breaks the content guidelines for being "excessively loud / obnoxious"?"

If it breaks the guidelines, that's up to s to comment on. However Oko took some time to reword this guideline for better clarity now.

Updated other things based off the feedback, and won't go through changing the otcm hitsounding mode-specific rule brought up in my forum post.


About adding more to the reference table ->
I can remove it or make any other changes, but I don't think I can add spek or khz guidelines to this. The reason why is I tried before and I was very strongly told no. See community/forums/topics/923648?n=22 .
Topic Starter
Noffy
- Decided to remove reference table, may be better for a guide somewhere else.

- Added a new requirement based on community/forums/topics/1768985?n=1
- Added a new requirement for sampling rate, as higher values will cause lazer to shit bricks.
Leviathan

Noffy wrote:

About adding more to the reference table ->
I can remove it or make any other changes, but I don't think I can add spek or khz guidelines to this. The reason why is I tried before and I was very strongly told no. See community/forums/topics/923648?n=22 .
then why is there a rule against encoding upwards when theres no information on what that is and how to tell? no average person is going to tell just by listening unless its obviously incredibly low (in fact 99% of bloated audio files are because of bad youtube rips and people not being able to tell because they sound fine to them) and i dont think having to rely on people who know more is a good thing -.-
Topic Starter
Noffy

Leviathan wrote:

Noffy wrote:

About adding more to the reference table ->
I can remove it or make any other changes, but I don't think I can add spek or khz guidelines to this. The reason why is I tried before and I was very strongly told no. See community/forums/topics/923648?n=22 .
then why is there a rule against encoding upwards when theres no information on what that is and how to tell? no average person is going to tell just by listening unless its obviously incredibly low (in fact 99% of bloated audio files are because of bad youtube rips and people not being able to tell because they sound fine to them) and i dont think having to rely on people who know more is a good thing -.-
Well
1. Don't encode upwards: Same ideals of saving on filesize as many other optimization related rules

2. Not adding that guidance: Better suited for guides than RC. RC gets taken as law, so even things written to be helpful can be misunderstood as very strict if it's included. Hence why my original table was only summarizing rules, not adding information. Guides like community/forums/topics/1731264?n=1 or https://railgun.site/guides/audio/ exist in a simple search too.

3. Many common visuals for spek are based off of LAME encoded mp3s. While that's the most common method, mp3s encoded with other methods may lead to entirely different spectrograms, so ears are the best check.

Not that I am keen on the current mixture, but I also don't want to literally do the exact things I've already been told not to do before when other resources exist too.
SupaV
There seems to be a lot of redundancy, where the same clause (or similar) is often repeated. Not sure if this is intended, but even if intended, I recommend just keeping one line in the more important category (i.e. if it's already in "rules" don't repeat or split it in "guidelines")

  1. Minimum average bitrate: 128kbps, unless the best available audio is of lower quality.
  2. The audio file of a song should not be artificially extended. Examples include:
Removing the "minimum average bitrate" clause and just integrating it with the latter statement would be better as the rule "asks to find the best quality audio possible", and anything below 192kbps already echoes the sentiment that "a better audio can't be found, so the best is already used."

  1. Beatmaps of songs which are artificially extended must apply spread rules and guidelines based off of the song’s original length, instead of the extended length.
  2. If a song can not be found in high quality, use the highest quality available without encoding upwards from a lower bitrate.
The way stuff is worded around confused me when reading cause the first clause says it's fine, but the second one says it's not. It would be better to put them all on the same clause as such:

The audio file of a song should not be artificially extended. Examples include:
...
Exemptions to the rule above include
- Extensions of songs less than 30s...
- Two-song combinations
This way there's a clear guideline and exceptions instead of pulling them into two different categories.

  1. Some two-song combinations are also exempt, as long as the tracks are closely related. Examples include: being iterations of the same series of songs, related in lyrics or motifs, similar in tone and/or genre, etc.
  2. Combinations of 2 songs should be clearly and closely related.
I'd like to question the intention of the "combination of two songs" statement here since the two different statements feel very vague. The first statement is super route-y and vague, while the second is direct.

If the intention is to allow songs like Shiorigoto to be mixed, then this should follow the spread rules and guidelines of the original song. But if this is the intention, it would make songs that are meant to be listened together/sequential in an album which I believe is the more common use case, i.e. DGD 1, DGD 2, etc. the rule should be made clearer.

I think there should be some discussion of songs that are meant to be listened to together/sequential and continuously in an album and combined together should share the same spread rules or not cause it isn't explicitly implied in this proposal.

If you have hit objects within the first 150ms of a beatmap, you should add additional silence to the beginning of the audio file. Beatmaps with objects too early may experience noticeable performance drops at the start of a beatmap.
Correct me if I'm wrong, but is this rule meant to tackle the 0ms audio offset issue? If that's the case, I don't think 150ms should be used as the standard unless there is a factual observation that 150ms is the bare minimum, not 50, not 20, not anything after zero.
Visionary
Maximum Sampling Rate: 48,000Hz. 44,100Hz is also very common for music and fine to use when available.
good idea

If a song can not be found in high quality, use the highest quality available without encoding upwards from a lower bitrate.
should probably add to be "If a song can not be found in high quality, use the highest quality available without encoding upwards from a lower bitrate or sample rate."

An active hitsound’s file must...

...have a clear impact, with no more than 5ms delay before the sound’s peak. normal-hitfinish.wav from the default skin is exempt from this.
...use the uncompressed WAV (.wav) or Ogg Vorbis (.ogg) file format. MP3 must not be used as it is inherently delayed.
it should be that every object has at least one hitsound file that satisfies this. this takes away from the ability to do some cool stuff like keysounding a bass with it pseudo-sidechained to a kick. as long as one sample exists, it gives sufficient feedback. current one is "All clicked parts of objects must have at least one hitsound which both...", just keep this
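To make the "no more than 5ms delay before the sound's peak" wording concrete, here is a minimal sketch of how that delay could be measured from decoded samples. The function name `peak_delay_ms` and the idea of taking the first loudest sample as "the peak" are assumptions for illustration, not something the RC or any osu! tool defines.

```python
def peak_delay_ms(samples, sample_rate_hz):
    """Return the time in ms from the start of the file to its loudest sample.

    `samples` is a sequence of mono amplitudes (hypothetical input; decoding
    the .wav/.ogg file is left out of this sketch).
    """
    if not samples:
        raise ValueError("no samples given")
    # Index of the first sample with the largest absolute amplitude.
    peak_index = max(range(len(samples)), key=lambda i: abs(samples[i]))
    return peak_index * 1000 / sample_rate_hz


# 96 silent samples at 48,000Hz put the peak 2ms in, which is under 5ms.
quick_hit = [0.0] * 96 + [1.0] + [0.5] * 10
# 480 silent samples at 48,000Hz put the peak 10ms in, which is over 5ms.
late_hit = [0.0] * 480 + [1.0]
```

With this reading, `quick_hit` would pass the 5ms wording and `late_hit` would not, and the "at least one hitsound" version of the rule would only need one sample on the object to pass.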

Gameplay sounds excluding active hitsounds should use the MP3 (.mp3) or Ogg Vorbis (.ogg) file format. These files usually have long durations and uncompressed WAV (.wav) files are unnecessarily large in comparison, however uncompressed WAV should be used when it results in a smaller file size.
if a wav file is smaller than an mp3 file it means the quality is going to be incredibly bad, since wav files contain the raw data of the audio. likely just remove the last section

Avoid using samples identical to the song. This can cause issues with phase cancellation, causing the hitsound to seem distorted or inaudible
not sure how u would phrase this but this seems like smth that needs to be added

Check this by listening to the audio rather than using software on its own.
too many people are used to spek and stuff atp (even when it causes them to say perfectly fine audio is overencoded lol) so probably change this to the current rc "This is best determined by listening to the audio, rather than using software on its own."
Molybdenum
"A preview point must be set. This is used for both song select and the online previews.
The preview point must be consistent between all difficulties."

These can probably be merged into one statement.

"A preview point must be set and consistent across all difficulties. This is used for both song select and the online previews." maybe
Leomine
Hi, HS rules and guidelines check :)

Storyboard sound effects cannot be used as replacements for active hitsounds. These give an inaccurate form of player feedback. Storyboard sound effects in other situations are acceptable, but discouraged. osu!mania is exempt from this rule.
This rule had to be explained in the general rules instead of another github file. A lot of maps have this issue cause not everyone see this section considering this is at the ending of the RC page, it would be a good thing if this would be added in RC Audio as well.

Avoid replacing the hit finish in soft/normal samplesets with frequently used custom hitsound samples. Using these finishes to represent snare/bass drums or a song's melody can sound obnoxious for anyone disabling beatmap hitsounds. Replacing hit whistles/claps is recommended because those samples are used more often. osu!taiko beatmaps are exempt from this guideline and have their own mode-specific hitsound sample guideline.
This never been so clear as it expect, general pov is to avoid the cymbal change between Normal and Soft addition, which i see a lot of maps fail to comply with this guideline, so why is there this script on the rules although ppl don't respect it? It would lead to understand also "avoid to use finish addition as snare/kick" and this is fine if it weren't for the fact that the main guideline title expresses to "not use them during a frequent custom sampleset change".
i would suggest to express it in another way

Avoid replacing the hit finish in soft/normal with other addition custom hitsounds. In default osu! skin, using these finishes to represent snare/bass drums or a song's melody can sound obnoxious for anyone disabling beatmap hitsounds. Replacing hit whistles/claps is recommended because those samples are used more often. osu!taiko beatmaps are exempt from this guideline and have their own mode-specific hitsound sample guideline.

The rest is fine from my point of view, hope my message could be productive
Topic Starter
Noffy
It was mentioned a few times here and elsewhere for the sampling rate rule:

We shouldn't have a clause about upsampling specifically. I think earlier this week when adding sampling rate limits, it ended up kind of thrown in there reflecting bitrate rules out of habit.

I think the previous rules for low quality audio in high quality trenchcoats already covers what is needed for it.

The reason why is adding a comment for sampling rate upsampling leads people to worry about 44.1->48 conversions which... isn't really necessary to worry about?

One way to tell the difference in higher quality audio would be the Nyquist limit, which would normally be half the sampling rate. The thing is, the typical cutoff for 192kbps mp3s on popular encoders is a bit lower than where the sampling rate alone would set the limit. There is no reliable and easy way I know of to check if a 192kbps mp3 audio file was upsampled from 44.1khz to 48khz ... And since the size comes from the bitrate, a 48khz mp3 at 192kbps and a 44.1khz mp3 at 192kbps will sound basically the same for our needs and be the same filesize.
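A rough sketch of the arithmetic behind this, assuming a constant-bitrate mp3 and ignoring headers and tags. The function names are made up for illustration.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (half the rate)."""
    return sample_rate_hz / 2


def cbr_size_bytes(bitrate_kbps, duration_s):
    """Approximate size of a constant-bitrate file; the sampling rate plays no part.

    Assumes 1 kbps = 1000 bits per second and ignores container overhead.
    """
    return int(bitrate_kbps * 1000 * duration_s / 8)


# Nyquist limits for the two common sampling rates.
limit_44k = nyquist_hz(44100)   # 22050.0 Hz
limit_48k = nyquist_hz(48000)   # 24000.0 Hz

# A 3 minute song at 192kbps comes out the same size at either sampling rate.
size_3min = cbr_size_bytes(192, 180)  # 4320000 bytes
```

The sampling rate never enters the size calculation, so under these assumptions a 44.1khz to 48khz conversion at the same bitrate would not change the filesize.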

Without a tool to readily know when it's happening and without a pressing reason to include it, I think it does more harm than good to include any explicit comments on upsampling because it's just confusing of "how would we know" or "how is it checked" or "is it relevant to actually include" when quality is not why we needed to add a sampling rate comment to start with (96khz ogg file causing breakage in lazer).



===================================================================================================


@supav
The first part of your post was admittedly confusing to me, so I'm not sure I fully understood your proposed fix regarding redundancy.

But i did go ahead and update the description for song combinations.

on the last one, it's not the 0ms offset issue. there are consistent lag spikes if objects are in the first ~150ms of a beatmap. because it's a lag spike it's hard to measure an exact number, but this is what oko helped to observe.
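As a small illustration of what the guideline would ask for in practice, here is a sketch that works out how much silence to prepend. The 150ms default is the approximate figure from the observation above, not a measured constant, and the function name is hypothetical.

```python
def silence_to_add_ms(first_object_ms, threshold_ms=150.0):
    """Silence (in ms) to prepend so the first object lands at the threshold.

    `threshold_ms` defaults to the rough ~150ms figure discussed here; it is
    an assumption of this sketch rather than an exact limit.
    """
    return max(0.0, threshold_ms - first_object_ms)


early_map = silence_to_add_ms(40)    # object at 40ms needs 110ms of silence
safe_map = silence_to_add_ms(300)    # object at 300ms needs nothing
```

Any silence added this way would also mean shifting the map's timing and objects by the same amount.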

@visionary

Since active hitsounds are separately defined on the wiki, we can link it to the defining rc page with a proper PR I think

Will be removing the .wav allowance for gameplay sounds as I agree and never seen a case of it actually being smaller since we added this rule. moving it to rules since it would be stricter. I'm also making it more explicit since "gameplay sounds except..." was too ambiguous to many people I've discussed this with over the years.

Are there any examples of phase cancellation that came up in a map going for ranked? that has to be, like, ms-perfect to happen and i've never heard of such a thing happening in osu. Usually it ends up a problem on whether it's audible because it's too similar, wouldn't it?

@molybdenum
will revise

@leomine
not sure if your point for sb sfx was supporting the change or suggesting something else be done with the proposal.

for finishes: I personally haven't really seen people breaking the existing guideline much? If anything MV should be less sensitive with its check for this, because most times it mentions it I think it's a false-positive and it still sounds fine / doesn't break the guideline. But I'm not an MV developer.

This point has nothing to do with how often the sampleset changes, just how often the finish slot is used.

As suggested I've tried out re-wording this to make it more direct.
Visionary
There is no reliable and easy way I know of to check if an 192kbps mp3 audio file was upsampled from 44.1khz to 48khz
yes, but there is also no reliable way to check if the bitrate has been upsampled. despite what everyone with spek installed will tell you, not every song is mixed and mastered the same, and the 19kHz cutoff is not a reliable way to tell if an audio has been overencoded. without a doubt, the best way to check this is to just source a lossless file of the song and compare the signals, not apply some heuristic that does not apply to 90% of genres that are not as commonly seen. if upsampling from lower bitrate is a problem because of bloated filesize, then upsampling from lower samplerate is the same issue.
And since the size is from bitrate, a 48khz at 192kbps and 44.1khz mp3 at 192kbps will sound basically the same for our needs and be the same filesize.
i absolutely agree that most people won't be able to tell the difference, but in reality you will still have an upsampled signal. the reason the rule is there in the first place is to ensure the audio is not bloating filesize, and regardless of how much of a difference it seems like, 44.1->48 is going to lead to a bloated audio file. if we go off of "sounds the same", most people can't even tell the difference between a 128kbps 44.1 and 320kbps 44.1. honestly i think the only time this would ever need to be checked is if the file is 48kHz, because it's highly unlikely you would have a lower samplerate than 44.1kHz unless it's some old vgm.

Are there any examples of phase cancellation that came up in a map going for ranked? that has to be, like, ms-perfect to happen and i've never heard of such a thing happening in osu. Usually it ends up a problem on whether it's audible because it's too similar, wouldn't it?
yes actually, and the polarity doesn't even need to be reversed!

i can't remember all maps that i've seen where samples were ripped directly from the song (or had the same source), but an example i know of is kiddly's neurotoxin (and it becomes unbearable to play with hitsounds)

no, they don't need to be ms-perfect, they can be slightly offset by a few ms in either direction (as would be the case when playing) and still cause noticeable distortion. i made some examples here:
* https://visionaryww.s-ul.eu/kpH1ds1F
* https://visionaryww.s-ul.eu/lgVWqUaC
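For anyone who wants to see the effect in numbers rather than audio, here is a minimal sketch that sums a sine tone with a delayed copy of itself and reports the peak of the result. It assumes a steady pure tone, which real hitsounds and songs are not, and the function name is made up for illustration.

```python
import math


def peak_of_sum(freq_hz, delay_ms, sample_rate_hz=48000, duration_s=0.05):
    """Peak amplitude of a sine tone mixed with a copy delayed by `delay_ms`.

    Both signals have amplitude 1 and the same polarity, so a perfectly
    aligned mix peaks at 2.0. Steady-state only: the delayed copy is treated
    as already playing at t = 0.
    """
    delay_s = delay_ms / 1000
    peak = 0.0
    for i in range(int(sample_rate_hz * duration_s)):
        t = i / sample_rate_hz
        song = math.sin(2 * math.pi * freq_hz * t)
        hitsound = math.sin(2 * math.pi * freq_hz * (t - delay_s))
        peak = max(peak, abs(song + hitsound))
    return peak


aligned = peak_of_sum(1000, 0.0)    # adds up fully
partial = peak_of_sum(1000, 0.25)   # noticeably quieter than aligned
cancelled = peak_of_sum(1000, 0.5)  # half a period late: almost nothing left
```

For a 1kHz tone, half a millisecond of offset is half a period, so the two copies cancel with no polarity flip at all. Other frequencies cancel at other offsets, which is roughly why a full sample ends up sounding distorted rather than simply silent.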
SupaV
@noffy
yeah my bad, the copying thing messed up, to reiterate:

1)
- Minimum average bitrate: 128kbps, unless the best available audio is of lower quality.
- If a song can not be found in high quality, use the highest quality available without encoding upwards from a lower bitrate.

I asked if there would be a possibility to merge those two clauses together? This is since the statement "use best quality audio" is seemingly repeated twice here.

IIRC on the previous iteration they were a distance apart from each other, so to see them closer in this iteration sounds better

---

The song combinations part seem much better than before and sound clearer instead of being a mess like the previous iteration. Though, I'd like to ask about "Combinations of 2 songs should be clearly and closely related." still being there. Realistically, you've already established that sentiment in the "Rules" part:
"...Some two-song combinations are also exempt, as long as the tracks are closely related..."

Some two-song combinations are also exempt, as long as the tracks are closely related. Examples include: being sequential in a song series, being designed to flow into each other, related in lyrics or motifs, similar in tone and/or genre, etc.
So about this clause, IIRC the previous iteration allowed songs that are sequential in an album/designed to flow into each other, but not the other two? What's with the sudden change, or am I remembering wrong?

Current clause somewhat implies that a person can put two songs together and call it a day no?
Okoayu
Noffy & I discussed the two song combination thing yesterday; I think it's good to have in either Spread or Audio. Repeating it can't really hurt but it kinda is both so it's a bit weird

I think we loosely settled on just deleting it from audio if it ends up in spread with the examples provided or something i dont remember 100%